The combination of various physically plausible properties, such as no signaling, determinism, 
and experimental free will, is known to be incompatible with quantum correlations. Hence, these 
properties must be individually or jointly relaxed in any model of such correlations. The necessary 
degrees of relaxation are quantified here via natural distance and information-theoretic measures. 
This allows quantitative comparisons between different models in terms of the resources, such as the 
number of bits of randomness, communication, and/or correlation, that they require. For example, 
measurement dependence is a relatively strong resource for modeling singlet state correlations, 
with only 1/15 of one bit of correlation required between measurement settings and the underlying 
variable. It is shown how various 'relaxed' Bell inequalities may be obtained, which precisely specify 
the complementary degrees of relaxation required to model any given violation of a standard Bell 
inequality. The robustness of a class of Kochen-Specker theorems, to relaxation of measurement 
independence, is also investigated. It is shown that a theorem of Mermin remains valid unless 
measurement independence is relaxed by 1/3. The Conway-Kochen 'free will' theorem and a result 
of Hardy are less robust, failing if measurement independence is relaxed by only 6.5% and 4.5%, 
respectively. An appendix shows the existence of an outcome independent model is equivalent to 
the existence of a deterministic model. 

PACS numbers: 03.65.Ta 



I. INTRODUCTION 

Bell inequalities and Kochen-Specker theorems demon- 
strate that at least one very plausible property - such as 
no signaling, determinism, or measurement independence 
- does not hold in a world that exhibits quantum corre- 
lations. Any model or simulation of quantum sys- 
tems must, therefore, give up at least one such property. 
But how much must be given up? Is 20% indetermin- 
ism sufficient to maximally violate a Bell inequality? Is 
a combination of 5% signaling and 10% measurement de- 
pendence enough to simulate singlet state correlations? 

The question of the degree to which such properties 
must be relaxed is of fundamental interest in construct- 
ing physical theories. It is also relevant to understanding 
so-called 'quantum nonlocality' as a physical resource, in 
tasks such as quantum computation and secure quantum 
cryptography. For example, singlet state correlations can 
be modeled by giving up 100% of determinism [l^, or 
14% of measurement independence (related to the free- 
dom to choose experimental settings) [l3|. Hence, in- 
determinism appears to be a weaker 'nonlocal' resource 
than experimental free will, for simulating the singlet 
state. 

The main aim of this paper is to carefully define and 
quantify the degrees to which certain physical proper- 
ties hold for a given model of correlations, and show how 
these may be applied to determine (i) optimal singlet 
state models; (ii) the minimal degrees of relaxation re- 
quired to simulate violations of various Bell inequalities, 
and (iii) the relative robustness of Kochen-Specker theo- 
rems. 

The physical properties considered are precisely those 
which are brought into question by the existence of quan- 
tum correlations. The quantitative nature of the results 
helps considerably to clarify the nature of these correla- 
tions, as well as the resources required for their simulation. 

The general form of underlying (or 'hidden variable') 
models of statistical correlations is recalled in Sec. II, 
and the degrees to which such underlying models pos- 
sess a number of physically plausible properties, such as 
determinism, outcome independence, no signaling and 
measurement independence, are defined and discussed 
in Secs. III-V. Both statistical and information-theoretic 
based measures are considered. These sections, together 
with Appendix A, also demonstrate that the proper- 
ties of determinism and outcome independence are ef- 
fectively equivalent, and relate the degree of communi- 
cation required to implement a given nonlocal model to 
the amount of signaling permitted by the model. 

In Sec. VI it is demonstrated that there are three 
canonical models of singlet state correlations, corre- 
sponding to the minimal degrees to which one of the 
above mentioned properties must be relaxed while main- 
taining the others. The corresponding information- 
theoretic resources required are 1 bit of randomness gen- 
eration or outcome correlation, 1 bit of signaling or com- 
munication, and 1/15 of one bit of correlation between 
the underlying variable and the measurement settings. 

It is shown in Sec. VII, together with Appendices B 
and C, how to derive 'relaxed' Bell inequalities. These 
precisely quantify the individual and/or joint degrees of 
relaxation required to model a given violation of a stan- 
dard Bell inequality. Examples include the joint relax- 
ation of determinism, no signaling and measurement in- 
dependence for the Bell-CHSH inequality [2], verifying 
a recent conjecture p^ : the relaxation of outcome inde- 
pendence for the same inequality; and the relaxation of 
indeterminism and no signaling for a form of the I3322 
inequality [6]. 

Sec. VIII shows how local deterministic models may 
be obtained for the perfect correlations underlying mem- 
bers of a strong class of Kochen-Specker theorems. 
These models require the relaxation of measurement in- 
dependence, and the minimal degree of relaxation quanti- 
fies the relative robustness of such theorems. It is found 
that a version due to Mermin [10] is the most robust, 
requiring relaxation by 1/3. 

Conclusions are given in Sec. IX. 

II. UNDERLYING MODELS 

Consider a given set of statistical correlations, {p(a, b|x, y)}, where the pair (a, b) 
labels the possible outcomes of a joint experiment (x, y), for some fixed preparation 
procedure. Any underlying model of these correlations introduces an underlying variable 
λ on which the correlations depend, which is typically interpreted as representing 
information about the preparation procedure. From Bayes theorem one has the identity 

p(a, b|x, y) = \int d\lambda \, p(a, b|x, y, \lambda) \, p(\lambda|x, y),    (1) 

with integration replaced by summation over any discrete ranges of λ. A given underlying 
model specifies the type of information encoded by λ, and the underlying probability 
densities p(a, b|x, y, λ) and p(λ|x, y). 

For example, the standard Hilbert space model of quantum correlations represents the 
underlying variable by a density operator, ρ, and the joint measurement setting by a 
probability operator measure, with 

p(a, b|x, y, \rho) = \mathrm{tr}[\rho E^{xy}_{ab}], \quad p(\rho|x, y) = \delta(\rho - \rho_0).    (2) 

One may alternatively use a pure state model, of the form 

p(a, b|x, y, \psi) = \langle\psi| E^{xy}_{ab} |\psi\rangle, \quad p(\psi|x, y) = p_0(\psi), 

where λ is restricted to the set of unit vectors on the Hilbert space, and the models 
are related by \rho_0 = \int d\psi \, p_0(\psi) \, |\psi\rangle\langle\psi|. 

A given underlying model may or may not satisfy var- 
ious physically plausible properties, such as no signaling, 
determinism, outcome independence, etc. The violation 
of Bell inequalities and Kochen-Specker theorems, by cer- 
tain quantum correlations, implies that at least one such 
property must be relaxed by any model of these correla- 
tions. The necessary degrees of relaxation are the central 
concern of this paper, and help both to clarify and quan- 
tify the nonclassical nature of quantum entanglement. 

These properties are defined in Secs. III-V below, together with natural measures of 
the degree to which they hold for a given model. These measures can generally be 
expressed in terms of the variational distance between two probability distributions 
P and Q, 

D(P, Q) := \sum_n |P(n) - Q(n)|, 



or in terms of Shannon entropy and mutual information. 
While the distance measures are typically easier to work 
with, the information-theoretic measures have the ad- 
vantage of directly quantifying various resources, such 
as randomness, correlation information, and communi- 
cation capacity. 
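
As an illustration of the two kinds of measure just mentioned, the following minimal Python sketch evaluates the variational distance and the Shannon entropy for discrete distributions. It is not part of the original paper: the function names are hypothetical, and distributions are assumed to be plain lists of probabilities over a common outcome set.

```python
from math import log2


def variational_distance(p, q):
    """Return D(P, Q) = sum_n |P(n) - Q(n)| for distributions on the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same outcome set")
    return sum(abs(pn - qn) for pn, qn in zip(p, q))


def shannon_entropy(p):
    """Return the Shannon entropy of p in bits, with the convention 0 log 0 = 0."""
    return sum(-pn * log2(pn) for pn in p if pn > 0)


# Identical distributions are at distance 0; distributions with
# disjoint supports reach the maximum possible distance of 2.
print(variational_distance([0.5, 0.5], [0.5, 0.5]))
print(variational_distance([1.0, 0.0], [0.0, 1.0]))
# A fair coin carries one bit of randomness.
print(shannon_entropy([0.5, 0.5]))
```

With this normalisation (no factor of 1/2) the distance ranges from 0 to 2, which is why the distance-based measures below have upper bounds of 1 or 2 rather than 1/2 or 1.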

III. DETERMINISM AND OUTCOME 
INDEPENDENCE 

A. Physical significance 

Determinism is the property that all outcomes can be predicted with certainty, given 
knowledge of the underlying variable λ, i.e., p(a, b|x, y, λ) = 0 or 1. This is easily 
shown to be equivalent to the property that all underlying marginal probabilities are 
deterministic, i.e., to 

p(a|x, y, \lambda), \; p(b|x, y, \lambda) \in \{0, 1\}.    (3) 

In contrast, outcome independence is the property that, given knowledge of the 
underlying variable λ, the joint measurement outcomes are uncorrelated [16], i.e., 

p(a, b|x, y, \lambda) = p(a|x, y, \lambda) \, p(b|x, y, \lambda).    (4) 

Thus, any observable correlations arise only as a consequence of ignorance of the 
underlying variable. 

Any deterministic model is trivially outcome independent (see Appendix A), and so it 
may appear that determinism is a more restrictive property. However, as shown in 
Appendix A, the difference between these two properties is largely cosmetic: for any 
set of statistical correlations, {p(a, b|x, y)}, there exists an underlying 
deterministic model M if and only if there exists an underlying outcome independent 
model M'. Furthermore, M satisfies no signaling or measurement independence if and 
only if M' does. 

At least two plausible arguments may be made for the 
existence of an underlying deterministic (and hence out- 
come independent) model of physical correlations. The 
first is based on a 'realist' interpretation of probability, 
in which the assignation of probabilities to measurement 
outcomes merely reflects ignorance as to an underlying 
'real state of affairs'. This implies an underlying de- 
terministic model for the outcomes, where p(λ|x, y) in 
Eq. (1) describes ignorance of the precise state of affairs. 

This argument is easily countered by adopting a non- 
realist interpretation of probability, with measurement 
considered to be an act of creation rather than one of 
revelation |l3, [3- Indeed, Bohr stated that "we have in 
each experimental arrangement . . . not merely to do with 
the ignorance of the value of certain physical quantities, 
but with the impossibility of defining these quantities in 
an unambiguous way" [18]. For example, one may adopt 
a Bayesian interpretation of probability, where probabili- 
ties reflect consistent methods for making predictions on 
the basis of given knowledge [19], without requiring the 
existence of some underlying 'perfect' knowledge. 

The second main argument for determinism is based on the existence of perfect 
correlations. In particular, as first pointed out by Einstein, Podolsky and Rosen [20], 
perfect quantum correlations can exist between the outcomes corresponding to a given 
joint measurement setting (x, y). Thus, knowledge of the outcome for setting x 
immediately implies knowledge of the outcome for setting y, and vice versa. If no 
signaling between the two measurement regions is permitted, it immediately appears 
that the outcomes must have been predetermined: how else could such a perfect 
correlation be realised? Since quantum mechanics does not assign deterministic values 
to these outcomes, some underlying model must then do so. This argument was also used 
by Bell in obtaining the original Bell inequality [1]. 

However, this argument may also be countered, even 
when no signaling is assumed. For example, in the many- 
worlds interpretation of quantum mechanics, the two ob- 
servers may in fact obtain random outcomes that do not 
always satisfy the predicted correlation - in which case 
they will simply end up in different branches of the universal wave function, unable 
to compare their inconsistent results [21]. In Bayesian interpretations, the rebuttal 
is that the correlations are a property of degrees of belief of observers (which may 
be informed by quantum models), rather than of some physical state per se, where any 
knowledge gained about one outcome from the other outcome (e.g., due to a perfect 
correlation) merely reflects a local and consistent updating of either observer's degree 
of belief M. 



B. Indeterminism and outcome dependence 

The degree of indeterminism of an underlying model may be defined as just how far away 
the marginal probabilities can be from the deterministic values of 0 and 1 in Eq. (3). 
This is the smallest nonnegative number, I, such that 

p(a|x, y, \lambda), \; p(b|x, y, \lambda) \in [0, I] \cup [1 - I, 1].    (5) 

Thus, 0 ≤ I ≤ 1/2, with I = 0 if and only if the probabilities are confined to {0, 1} 
as per Eq. (3), i.e., if and 
only if the model is deterministic [l^ . 

A simple measure of outcome dependence, O, is the maximum variational distance between 
an underlying joint distribution and the product of its marginals, i.e., 

O := \sup_{x, y, \lambda} \sum_{a, b} |p(a, b|x, y, \lambda) - p(a|x, y, \lambda) \, p(b|x, y, \lambda)|.    (6) 

Thus, 0 ≤ O ≤ 2, and it follows immediately from Eq. (4) that O = 0 if and only if 
outcome independence is satisfied. 

As noted above, the properties of determinism and outcome independence are closely 
related. For example, as shown in Appendix A, for the particular case of two-valued 
outcomes one has the tight inequality 

O \leq 4 I (1 - I) \leq 1.    (7) 

This inequality chain is saturated, for example, by the 
singlet state of two qubits (see Sec. VI), and by PR- 
boxes psj . In both cases one has the maximum possible 
degrees of indeterminism and outcome dependence, i.e., 
I = 1/2 and O = 1. 
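
The two distance-based measures can be checked numerically for a single underlying joint distribution. The sketch below is an illustration only, not code from the paper: the function names are hypothetical, the joint distribution is assumed to be a nested list `joint[a][b]` at one fixed choice of settings and underlying variable, and the supremum over settings and underlying variables is left to the caller.

```python
def marginals(joint):
    """Return the two marginal distributions of joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return pa, pb


def indeterminism(joint):
    """Smallest I with every marginal probability in [0, I] or [1 - I, 1]."""
    pa, pb = marginals(joint)
    return max(min(p, 1.0 - p) for p in pa + pb)


def outcome_dependence(joint):
    """Variational distance between the joint and the product of its marginals."""
    pa, pb = marginals(joint)
    return sum(
        abs(joint[a][b] - pa[a] * pb[b])
        for a in range(len(pa))
        for b in range(len(pb))
    )


# Perfectly anticorrelated two-valued outcomes with unbiased marginals,
# as for singlet state measurements along equal directions.
anticorrelated = [[0.0, 0.5], [0.5, 0.0]]
I = indeterminism(anticorrelated)
O = outcome_dependence(anticorrelated)
print(I, O, 4 * I * (1 - I))
```

For this distribution both measures take their largest two-valued values, I = 1/2 and O = 1, so that O and 4I(1 - I) coincide.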



C. Random bits and outcome correlation 

Indeterminism corresponds to a degree of randomness. Hence, a natural 
information-theoretic measure of indeterminism is given by the maximum entropy of the 
underlying marginal probability distributions: 

C_{\rm random} := \sup_{x, y, \lambda} \{ H_{x,y,\lambda}(A), \, H_{x,y,\lambda}(B) \},    (8) 

where H_{x,y,λ}(A) denotes the Shannon entropy of the outcome distribution 
{p(a|x, y, λ)}. Thus, C_random is the maximum number of random bits that must be 
generated to simulate a local outcome distribution, and C_random = 0 for deterministic 
models. Since there is an underlying marginal probability arbitrarily close to I, one 
has the lower bound 

C_{\rm random} \geq h(I),    (9) 

with equality for the case of two-valued outcomes, where 

h(x) := -x \log_2 x - (1 - x) \log_2 (1 - x).    (10) 

A corresponding information-theoretic measure of outcome dependence is given by the 
maximum Shannon mutual information between the outcomes: 

C_{\rm outcome} := \sup_{x, y, \lambda} H_{x,y,\lambda}(A : B) 
               = \sup_{x, y, \lambda} \sum_{a, b} p(a, b|x, y, \lambda) \log_2 \frac{p(a, b|x, y, \lambda)}{p(a|x, y, \lambda) \, p(b|x, y, \lambda)}.    (11) 

This quantifies the maximum degree of correlation that is present between measurement 
outcomes, given knowledge of the underlying variable λ [13], and vanishes for models 
satisfying outcome independence via Eq. (4). One has the relations 

C_{\rm random} \geq C_{\rm outcome} \geq \tfrac{1}{2} O^2 \log_2 e,    (12) 

where the upper bound follows from Eq. (8), and the (nontight) lower bound from 
Pinsker's inequality [25]. For the case of two-valued measurement outcomes this lower 
bound can be improved to the tight bound 

C_{\rm outcome} \geq 1 - h\left( \frac{1 + O}{2} \right),    (13) 

in analogy to Eq. (9). In the standard Hilbert space model of singlet state spin 
correlations the maximum possible values for two-valued outcomes, C_random = 
C_outcome = 1 bit, are achieved (see Sec. VI). 
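
The relation between the distance-based and information-theoretic measures of outcome dependence can be made concrete for a simple family of two-valued joint distributions. The following sketch is an illustration under stated assumptions, not code from the paper: the function names and the one-parameter family (c, 1/2 - c, 1/2 - c, c) with unbiased marginals are chosen here for the example. For this family a direct calculation gives a mutual information of 1 - h((1 + O)/2) bits, and the code confirms this numerically.

```python
from math import log2


def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1.0 - x) * log2(1.0 - x)


def mutual_information(joint):
    """Shannon mutual information (bits) between the two outcomes of joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (pa[a] * pb[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )


def outcome_dependence(joint):
    """Variational distance between the joint and the product of its marginals."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        abs(p - pa[a] * pb[b])
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
    )


def symmetric_family(c):
    """Joint distribution (c, 1/2 - c, 1/2 - c, c), assumed here for illustration."""
    return [[c, 0.5 - c], [0.5 - c, c]]


for c in (0.0, 0.1, 0.25, 0.4, 0.5):
    joint = symmetric_family(c)
    O = outcome_dependence(joint)
    # The last two columns agree for every member of the family.
    print(c, O, mutual_information(joint), 1.0 - h((1.0 + O) / 2.0))
```

The case c = 0 is the perfectly anticorrelated distribution, for which both the outcome dependence and the mutual information reach their two-valued maxima of 1 and 1 bit.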
IV. NO SIGNALING 



A. Physical significance 



The property of no signaling (or parameter independence) is satisfied if the 
underlying marginal distribution associated with one setting is independent of the 
other setting, i.e., if 

p(a|x, y, \lambda) = p(a|x, y', \lambda), \quad p(b|x, y, \lambda) = p(b|x', y, \lambda)    (14) 

for all joint settings (x, y), (x, y') and (x', y) of the model. Thus, neither observer 
can affect the underlying measurement statistics of the other, via their choice of 
measurement setting. Hilbert space models satisfy this property when the measure in 
Eq. (2) has the tensor product form E^{x}_{a} \otimes E^{y}_{b}. 

There are two strong arguments for requiring physical 
models to have the no signaling property. The first ap- 
plies when the respective measurement settings are made 
in spacelike separated regions: altering the underlying 
statistics of a measurement in one such region, via vary- 
ing a measurement setting in the other region, would vi- 
olate the principle of relativistic causality and thus lead 
to the need to resolve various paradoxes. 

The second argument is that any signaling model un- 
derlying quantum correlations would have to explain 
the apparent 'conspiracy' that quantum correlations are 
themselves nonsignaling. In particular, all nonzero shifts 
in the underlying probability distributions, for any such 
underlying model, would have to average out to zero at 
the observable level. 

However, while relativistic causality is a natural as- 
sumption, it still may be possible to consistently resolve 
apparent paradoxes if it does not hold. Furthermore, it 
is often possible to transform 'conspiracies' into well mo- 
tivated 'physical principles'. Thus, for example, in the 
de Broglie-Bohm model of quantum mechanics one can 
either postulate a typical universal initial state [1^, or 
the existence of suitably smooth initial conditions relative 
to some degree of coarse graining [27]. 



B. Signaling 

The degree of signaling is quite simply defined as the maximum possible shift in an 
underlying marginal probability for one observer, as the consequence of changing the 
measurement setting of the other observer. More formally, one-way degrees of signaling 
are defined by [15] 

S_{1 \to 2} := \sup_{x, x', y, b, \lambda} |p(b|x, y, \lambda) - p(b|x', y, \lambda)|, 
S_{2 \to 1} := \sup_{x, y, y', a, \lambda} |p(a|x, y, \lambda) - p(a|x, y', \lambda)|, 

where a and b label measurement outcomes corresponding to measurement settings x and 
y, respectively. Thus, for example, S_{1→2} is the maximum possible shift in an 
underlying marginal probability distribution for the second observer, induced via 
changing a measurement setting of the first observer. If S_{1→2} > 0 and λ is known, 
the first observer can in principle communicate to the second observer merely by 
modulating the local measurement setting. 

The overall degree of signaling, for a given underlying model, is defined by 

S := \max\{ S_{1 \to 2}, S_{2 \to 1} \}.    (15) 

It follows that 0 ≤ S ≤ 1, and S = 0 for nonsignaling 
models |2i]. 

The degrees of indeterminism and signaling, I and S, are not fully independent of one 
another. For example, in a deterministic model the underlying marginal probabilities 
are restricted to the values 0 and 1, and hence only a probability shift of unity is 
possible between these values. More generally, any shift S in a marginal probability 
value must keep it in the range [0, I] ∪ [1 − I, 1], i.e., the value must either stay 
in the same subinterval (S ≤ I), or cross the gap between the subintervals 
(S ≥ 1 − 2I). Hence, 

I \geq \min\{ S, (1 - S)/2 \}.    (16) 

In contrast, the degree of outcome dependence, O, is completely independent of S. 



C. Signaling capacity 

The maximum signaling capacity of a given model is, in analogy to Eq. (11), given by 

C_{\rm sig} := \sup \{ H_{x,\lambda}(A : Y), \, H_{y,\lambda}(B : X) \},    (17) 

where H_{x,λ}(A : Y) denotes the Shannon mutual information between the measurement 
outcome of the first observer and the measurement setting of the second observer, for 
fixed x and λ. Thus, C_sig directly quantifies the amount of information that may be 
transmitted between observers via appropriate choices of measurement settings [24]. 



The two measures S and Csig are related via [Tsj 



C_{\rm sig} \geq 1 - h\left( \frac{1 + S}{2} \right),    (18) 

analogous to Eqs. (9) and (13). Thus, nonlocal communication is always possible, in 
principle, if S > 0. 

For example, the standard Hilbert space model in Eq. (2) is nonsignaling, with 
S = C_sig = 0. On the other hand, for the deterministic Toner-Bacon model of the 
singlet state [29], one has S = 1, since the probability of one observer's outcome can 
flip between 0 and 1, in dependence on the choice of measurement made by the first 
observer. Noting that the right hand side of Eq. (17) cannot be greater than 1 for 
two-valued measurements, it follows via Eq. (18) that C_sig = 1 bit for this model. 
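
The link between a probability shift S and a signaling capacity can be illustrated with a two-setting, two-outcome example. The sketch below is not from the paper and rests on explicit assumptions: the remote observer picks one of two settings with equal prior probability, the local outcome is two-valued, and the hypothetical helper `symmetric_shift` places the two local probabilities symmetrically about 1/2. In that case the information conveyed equals 1 - h((1 + S)/2) bits.

```python
from math import log2


def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1.0 - x) * log2(1.0 - x)


def signaled_information(p0, p1):
    """Mutual information (bits) between a remote binary setting choice,
    made with equal priors, and a local binary outcome whose probability
    of the first outcome is p0 or p1 depending on that choice."""
    return h(0.5 * (p0 + p1)) - 0.5 * (h(p0) + h(p1))


def symmetric_shift(S):
    """Two local probabilities differing by S and centred on 1/2 (assumed)."""
    return (1.0 - S) / 2.0, (1.0 + S) / 2.0


for S in (0.0, 0.3, 0.7, 1.0):
    p0, p1 = symmetric_shift(S)
    # The two printed values agree for every S.
    print(S, signaled_information(p0, p1), 1.0 - h((1.0 + S) / 2.0))
```

A unit shift, as in a deterministic model where the outcome flips with the remote setting, conveys a full bit, while no shift conveys nothing.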
D. Relation to communication models 

The signaling capacity of a model is, prima facie, a dif- 
ferent concept to the degree of nonlocal communication 
required to simulate a given model. The signaling capac- 
ity is the amount of information which the observers are 
able to exploit, in principle, for arbitrary communication 
once the model is in place. In contrast, the communi- 
cation capacity may be defined as the amount of infor- 
mation required to be transmitted between observers to 
simulate the model. The connections between the two 
concepts are explored and clarified below, in the context 
of one-way communication models. 

In a one-way communication model, a message m is 
communicated from the first observer to the second ob- 
server, which may depend on the measurement setting x 
and a shared underlying variable A [s^l. The message is 
used to generate outcomes for the second observer, such 
that Eq. (1) is satisfied. 

For example, in the Toner-Bacon model of the singlet state one has [29] 

m = f(x, \lambda) := \mathrm{sgn}(x \cdot \lambda_1) \, \mathrm{sgn}(x \cdot \lambda_2), 

where the underlying variable λ = (λ1, λ2) comprises two unit vectors λ1 and λ2 
uniformly distributed over the unit sphere. The corresponding measurement outcomes are 
deterministically generated as a = −sgn(x · λ1) and b = sgn[y · (λ1 + m λ2)], for spin 
directions x and y. 
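
The one-way protocol just described is simple enough to simulate directly. The following Monte Carlo sketch is an illustration only, not code from the paper or from Toner and Bacon: all function names are hypothetical, the two shared unit vectors are assumed to be drawn independently and uniformly, and the sample size and seed are arbitrary. It estimates the correlation of the outcomes a and b, which for the singlet state should approach minus the dot product of the two spin directions.

```python
import random
from math import cos, pi, sin, sqrt


def random_unit_vector(rng):
    """Draw a vector uniformly from the unit sphere via normalised Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def sgn(t):
    return 1 if t >= 0 else -1


def toner_bacon_round(x, y, rng):
    """One run of the protocol: returns outcomes a, b and the one-bit message m."""
    lam1 = random_unit_vector(rng)
    lam2 = random_unit_vector(rng)
    a = -sgn(dot(x, lam1))
    m = sgn(dot(x, lam1)) * sgn(dot(x, lam2))
    b = sgn(dot(y, [l1 + m * l2 for l1, l2 in zip(lam1, lam2)]))
    return a, b, m


def estimate_correlation(x, y, samples=40000, seed=0):
    """Monte Carlo estimate of the average of the product a * b."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        a, b, _ = toner_bacon_round(x, y, rng)
        total += a * b
    return total / samples


x = [0.0, 0.0, 1.0]
y = [sin(pi / 3), 0.0, cos(pi / 3)]  # 60 degrees from x, so x . y = 0.5
# Should be close to -0.5, up to statistical fluctuations.
print(estimate_correlation(x, y))
```

For equal directions the protocol gives perfectly anticorrelated outcomes in every run, since the message makes the second observer's effective vector point along the sign of x · λ1.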

Since λ is known by both observers, the maximum information obtainable from m, about 
the measurement setting and outcome of the first observer, is given by the mutual 
information H_λ(M : X, A). Since m is the only communication used to generate the 
underlying correlations, this information must subsume any information obtainable from 
the outcome b for any measurement setting y of the second observer. Hence, 

H_\lambda(M : X, A) \geq \sup_y H_{y,\lambda}(B : X, A) \geq \sup_y H_{y,\lambda}(B : X).    (19) 

The communication model will be said to be nonredundant if equality holds throughout. 

The communication capacity is defined to be the maximum possible mutual information 
that is communicated about x and a via the message m, i.e., 

C_{\rm commun} := \sup_\lambda H_\lambda(M : X, A).    (20) 

It follows immediately via Eqs. (17) and (19), recalling the communication is one-way 
only, that 

C_{\rm commun} \geq C_{\rm sig},    (21) 

with equality for nonredundant models. 

For a deterministic communication model (such as the Toner-Bacon model), the message 
and the outcome of the first observer are completely specified by x and λ, i.e., 

p(m, x, a|\lambda) = \delta_{m, f(x,\lambda)} \, \delta_{a, \alpha(x,\lambda)} \, p(x|\lambda), 

for suitable functions f and α. Hence, H_λ(M : X, A) = H_λ(M), and Eq. (20) simplifies 
to 

C^{\rm determ}_{\rm commun} = \sup_\lambda H_\lambda(M)    (22) 

for such models, i.e., the communication capacity is just the maximum possible entropy 
of the message. 

As an example, consider the Toner-Bacon model described above. If the distribution of 
measurement settings of the first observer, p(x), is uniform, then 
H_λ(M) = h(π^{-1} cos^{-1} λ1 · λ2), with h(x) defined as in Eq. (10) [29]. This is 
equal to 1 bit for λ1 · λ2 = 0. This is the maximum possible entropy H_λ in Eq. (22), 
since m only takes two values. Hence, 

C^{\rm TB}_{\rm commun} = 1 bit 

for this model. Note this also follows from Eq. (21), since C_sig = 1 bit from the 
previous section. An example of an indeterministic communication model is discussed 
in Sec. VII A. 

Toner and Bacon have numerically calculated the average of H_λ(M) over λ, for the case 
of a uniform distribution p(x), as ≈ 0.85 bits. As a consequence of the deterministic 
nature of the model, one further finds 

H(M, \Lambda : X) = \langle H_\lambda(M) \rangle \approx 0.85 bits    (23) 

for this case. In contrast, H(M : X) = 0 whenever the first observer's setting is 
independent of λ, i.e., p(x|λ) = p(x), implying no information can be gained about 
this setting from the knowledge of m alone. 
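
The quoted average of about 0.85 bits can be reproduced with a short numerical integration. The sketch below is an illustration, not the calculation used by Toner and Bacon: it assumes the two shared unit vectors are independent and uniform on the sphere, so that the angle θ between them has probability density sin(θ)/2 on [0, π], and it averages the message entropy h(θ/π) over that density with a midpoint rule. The function names and step count are choices made here.

```python
from math import log2, pi, sin


def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1.0 - x) * log2(1.0 - x)


def average_message_entropy(steps=20000):
    """Midpoint-rule estimate of the integral of h(theta / pi) * sin(theta) / 2
    over theta in [0, pi], i.e. the message entropy averaged over the angle
    between two independent, uniformly distributed unit vectors."""
    width = pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * width
        total += h(theta / pi) * 0.5 * sin(theta) * width
    return total


# Roughly 0.85 bits, compared with the maximum of 1 bit at theta = pi / 2.
print(average_message_entropy())
```

The average falls below the 1 bit capacity because the message is maximally random only when the two shared vectors are orthogonal.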



V. MEASUREMENT INDEPENDENCE AND 
EXPERIMENTAL FREE WILL 

A. Physical significance 

Measurement independence is the property that the distribution of the underlying 
variable is independent of the measurement settings, i.e., 

p(\lambda|x, y) = p(\lambda|x', y')    (24) 

for any joint settings (x, y), (x', y'). It is trivially satisfied by the quantum 
model in Eq. (2). It follows immediately via Bayes theorem that this property is 
equivalent to each of 

p(x, y|\lambda) = p(x, y), \quad p(x, y, \lambda) = p(x, y) \, p(\lambda), 

whenever there is a well defined distribution, p(x, y), of joint measurement 
settings [31]. 

Measurement independence, particularly in the form 
p{x, y\X) = p{x, y), is often justified by the notion of 'ex- 
perimental free will', i.e., that experimenters can freely 
choose between different measurement settings irrespec- 
tive of the underlying variable A describing the system. 
More neutrally, if random number generators are used to 
determine the measurement settings, it may be argued 
that the physical operation of these generators should be 
independent of the underlying variables describing the 
system that is to be measured. 

However, there is no a priori physical reason why the 
behaviour of experimenters or random generators should 
not be statistically correlated with a given system to 
some degree, reflecting a common causal dependence on 
some underlying variable. For example, as has been 
clearly pointed out in the quantum context by Brans [32], 
any fundamental deterministic model underlying nature 
should certainly predict the joint measurement settings 
(which are, after all, physical phenomena), to the same 
degree as it predicts the measurement outcomes. 

Further, a violation of measurement independence is 
not automatically inconsistent with apparent experimen- 
tal freedom. For example, suppose two experimenters run 
a series of experiments where they aim to choose their 
joint measurement settings according to some predeter- 
mined joint probability distribution p{x, y). For example, 
they might use random number generators to choose be- 
tween local settings according to some factorisable joint 
distribution p{x, y) = p{x)p(y). It might be argued that 
an underlying correlation, between the joint settings and 
some underlying variable A, could prevent such a pre- 
arranged joint distribution from being realised. However, 
this is not so: such a realisation merely restricts the joint 
distribution of x, y and λ to be 

p(x, y, \lambda) = p(\lambda|x, y) \, p(x, y),    (25) 

irrespective of whether or not measurement independence 
is satisfied. 

Finally, it may be mentioned that the violation of measurement independence is natural 
for retrocausal models, in which future measurement settings may influence the past 
statistics of the underlying variable. While retrocausality is counter-intuitive in 
allowing two directions of time, Price has shown it is surprisingly robust to 
paradoxes [33]. However, of course, one does not require retrocausality to violate the 
measurement independence property in Eq. (24) [32]. 



B. Measurement dependence and correlation 

The degree to which an underlying model violates measurement independence is most 
simply quantified by the variational distance 

M := \sup_{x, x', y, y'} \int d\lambda \, |p(\lambda|x, y) - p(\lambda|x', y')|.    (26) 

Thus, M = 0 when Eq. (24) holds. In contrast, a maximum value of M = 2 implies that 
there are at least two particular joint measurement settings, (x, y) and (x', y'), 
such that for any physical state λ at most one of these joint settings is possible. 
Hence, the observers can exercise no experimental free will whatsoever to choose 
between the joint settings in this case. Such a model has been given by Brans for any 
state of two qubits, where the underlying variable λ in fact completely determines the 
joint measurement settings [32] (this model easily generalises to any set of 
statistical correlations). Individual degrees of measurement dependence, M1 and M2, 
may also be defined for each observer but will not be considered here. 

The fraction of measurement independence corresponding to a given model is defined by 

F := 1 - M/2.    (27) 

Thus, 0 ≤ F ≤ 1, with F = 0 corresponding to the case where no experimental free will 
can be exercised to choose between two particular settings. Note that, geometrically, 
F also represents the minimum degree of overlap between any two underlying 
distributions p(λ|x, y) and p(λ|x', y'). 
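
For a discrete underlying variable the degree of measurement dependence and the fraction of measurement independence reduce to a maximisation over pairs of joint settings. The sketch below is illustrative only and not from the paper: the function names are hypothetical, and a model is assumed to be supplied as a dictionary mapping each joint setting to the list of conditional probabilities of the underlying variable for that setting.

```python
def measurement_dependence(cond):
    """Largest variational distance between the conditional distributions of
    the underlying variable for any two joint settings.

    cond maps a joint setting (x, y) to a list of probabilities over a common,
    discrete set of values of the underlying variable."""
    settings = list(cond)
    return max(
        sum(abs(p - q) for p, q in zip(cond[s], cond[t]))
        for s in settings
        for t in settings
    )


def fraction_of_independence(cond):
    """Fraction of measurement independence, 1 - M / 2."""
    return 1.0 - measurement_dependence(cond) / 2.0


# The same distribution for every setting: full measurement independence.
independent = {("x", "y"): [0.5, 0.5], ("x'", "y'"): [0.5, 0.5]}
# Disjoint supports: the underlying variable fixes the joint setting.
fixed = {("x", "y"): [1.0, 0.0], ("x'", "y'"): [0.0, 1.0]}
# A partial overlap between the two conditional distributions.
partial = {("x", "y"): [0.75, 0.25], ("x'", "y'"): [0.25, 0.75]}

for model in (independent, fixed, partial):
    print(measurement_dependence(model), fraction_of_independence(model))
```

In the partially overlapping example the two conditional distributions share half of their probability mass, which is the overlap interpretation of F noted above.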

A natural information-theoretic characterisation of the degree of measurement 
dependence has been recently proposed by Barrett and Gisin [34]. In particular, the 
mutual information between the measurement settings and the underlying variable, 

H(X, Y : \Lambda) = \sum_{x, y} \int d\lambda \, p(x, y, \lambda) \log_2 \frac{p(x, y, \lambda)}{p(x, y) \, p(\lambda)}, 

quantifies the degree of correlation between the joint measurement setting and the 
underlying variable [23]. It is well-defined whenever the joint distribution p(x, y) 
exists [U, with p(x, y, λ) given by Eq. (25). 

For models satisfying measurement independence, there is no correlation and the mutual 
information vanishes via Eq. (24). In contrast, for the Brans model of two qubits 
[32], where the hidden variable uniquely determines the joint measurement setting, 
there is perfect correlation, and the mutual information can become infinitely large 
(e.g., for the case of randomly chosen settings with p(x, y) = 1/(4π)^2). 

The measurement dependence capacity of a given model may be defined by maximising the 
mutual information over all possible distributions of measurement settings: 

C_{\rm meas\,dep} := \sup_{p(x, y)} H(X, Y : \Lambda).    (28) 

Barrett and Gisin have shown the existence of deterministic nonsignaling models of the 
singlet state with C_meas dep ≤ 1 bit [34]. It will be shown in the following section 
that a recently proposed model of this type has C_meas dep ≈ 0.0663 bits, i.e., no 
more than about 1/15 of one bit of mutual information is required to reproduce all 
spin correlations, for any distribution p(x, y) of experimental settings. 
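
For a discrete underlying variable and a finite set of joint settings, the mutual information between settings and underlying variable is a finite sum. The sketch below is an illustration under stated assumptions, not code from the paper: the function name and the dictionary-based inputs are hypothetical, and it evaluates the mutual information for one given distribution of settings only. Obtaining the capacity would additionally require maximising over that distribution, which is not attempted here.

```python
from math import log2


def setting_variable_information(p_settings, cond):
    """Mutual information (bits) between the joint setting and a discrete
    underlying variable.

    p_settings maps each joint setting to its probability p(x, y);
    cond maps each joint setting to the list of conditional probabilities
    of the underlying variable given that setting."""
    n_values = len(next(iter(cond.values())))
    # Marginal distribution of the underlying variable.
    p_lam = [
        sum(p_settings[s] * cond[s][k] for s in cond) for k in range(n_values)
    ]
    total = 0.0
    for s in cond:
        for k in range(n_values):
            joint = p_settings[s] * cond[s][k]
            if joint > 0:
                total += joint * log2(joint / (p_settings[s] * p_lam[k]))
    return total


uniform = {"s1": 0.5, "s2": 0.5}
# Measurement independence: no information about the setting.
print(setting_variable_information(uniform, {"s1": [0.5, 0.5], "s2": [0.5, 0.5]}))
# The underlying variable fixes which of two settings is used: one full bit.
print(setting_variable_information(uniform, {"s1": [1.0, 0.0], "s2": [0.0, 1.0]}))
# Weak correlation between setting and underlying variable: a small fraction of a bit.
print(setting_variable_information(uniform, {"s1": [0.6, 0.4], "s2": [0.4, 0.6]}))
```

The third example shows how a modest bias in the conditional distributions corresponds to only a few hundredths of a bit of correlation.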
VI. MINIMAL SINGLET STATE MODELS 

To indicate how the measures introduced above allow quantitative comparisons between 
different models, three fundamental models of the singlet state correlations 

p(a, b|x, y) = \tfrac{1}{4} (1 - a b \, x \cdot y)    (29) 

are briefly examined here, where a, b = ±1 denote spin-up and spin-down outcomes for 
measurements in directions x and y respectively. 

Each of the three models corresponds to the minimum 
possible relaxation of one of the properties of determin- 
ism, outcome independence, no signaling, and measure- 
ment independence, while retaining the others. It will 
be seen that measurement dependence is a particularly 
strong resource for modeling quantum correlations. 



A. Relaxing determinism 

First, consider the class of singlet state models which only relax determinism and/or 
outcome independence, i.e., for which S = M = 0. The canonical member of this class is 
the standard Hilbert space model. As noted in Sec. III, this model has the maximum 
possible degrees of indeterminism and outcome dependence, 

I_{\rm HS} = 1/2, \quad O_{\rm HS} = 1,    (30) 

as well as the maximum possible number of locally generated random bits and outcome 
correlation, 

C^{\rm HS}_{\rm random} = C^{\rm HS}_{\rm outcome} = 1 bit.    (31) 

The above properties in fact hold for any model of the singlet state satisfying no 
signaling and measurement independence. That is, if only determinism (or outcome 
independence) is relaxed, then it must be relaxed completely, to model all singlet 
state correlations. 

In particular, a strong result by Branciard et al. states that any underlying model of 
the singlet state with S = M = 0 must almost always predict a 50:50 chance of spin up 
or down in any direction, i.e., 

p(a|x, \lambda) = 1/2 = p(b|y, \lambda) 

for all λ, except possibly on a set of total probability zero [13]. This immediately 
implies via Eqs. (5) and (9) that I = 1/2 and C_random = 1 bit, as claimed. It further 
implies, using the notation of Eq. (A1), that the joint probability distribution 
p(a, b|x, y, λ) is of the form (c_λ, 1/2 − c_λ, 1/2 − c_λ, c_λ) for almost all 
underlying variables, with 0 ≤ c_λ ≤ 1/2 (note that the singlet state correlation in 
Eq. (29) is also of this form). But for the case x = y one has, via Eqs. (1) and (29), 

0 = p(a = b|x, x) = \int d\lambda \, p(a = b|x, x, \lambda) \, p(\lambda|x, x) 
                  = 2 \int d\lambda \, c_\lambda \, p(\lambda|x, x). 

Hence, c_λ = 0 for this case with probability unity, i.e., the joint distribution is 
of the form (0, 1/2, 1/2, 0). It immediately follows via Eqs. (6), (7) and (13) that 
O = 1 and C_outcome = 1 bit, as claimed. 



B. Relaxing no signaling 

The class of singlet state models which only relax no signaling, with
$I = M = 0$, is represented by the Toner-Bacon model. As noted in
Sec. IV C, this model in fact has the maximal possible degree of
signaling, i.e.

$$S^{TB} = 1, \qquad C^{TB}_{\rm sig} = 1~{\rm bit}. \tag{32}$$

These properties in fact hold for all deterministic measurement
independent models of the singlet state, and hence the Toner-Bacon model
is a canonical representative of such models.

To demonstrate the generic nature of Eq. (32) for $I = M = 0$, note first
from Eq. (15) that for deterministic underlying models one must either
have $S = 0$ or $S = 1$. But there are no singlet state models having
$I = S = M = 0$ [1]. Hence, $S = 1$, as claimed. This immediately implies
that there is some particular underlying variable, $\lambda$, for which
the marginal underlying probability of one observer shifts between the
values of 0 and 1, in dependence on which one of two measurement settings
is selected by the other observer. Selecting between these settings with
equal prior probabilities allows transmission of 1 bit of information per
measurement, in agreement with Eq. (18). Since this is the maximum
possible for two-valued measurement outcomes, it follows that
$C_{\rm sig} = 1$ bit, as claimed.



C. Relaxing measurement independence 

It is seen from the above that, when relaxed individually, determinism
or no signaling must be completely relaxed to model the singlet state (as
must outcome independence). It has recently been conjectured that, when
jointly relaxed, the degrees of indeterminism and signaling must satisfy
the complementarity relations [15, 35]

$$S + 2I \geq 1, \qquad C_{\rm random} + C_{\rm sig} \geq 1~{\rm bit}. \tag{33}$$

Thus, it appears that at least 1 bit of total resources is required for
any measurement independent model of the singlet state. In contrast, if
instead measurement independence is relaxed, only 1/15 of a bit is
required, as will be shown below. Measurement dependence is, therefore, a
relatively strong resource for simulating quantum correlations.

In particular, for $I = S = 0$, a singlet state model has been recently
given with deterministic local outcomes $a = {\rm sgn}\, x\cdot\lambda$
and $b = -{\rm sgn}\, y\cdot\lambda$, for measurement directions $x$ and
$y$, where $\lambda$ denotes a unit 3-vector with probability density [14]

$$\begin{aligned}
p(\lambda|x,y) &:= \frac{1 + x\cdot y}{8(\pi - \phi_{xy})}
   \quad {\rm for}~ {\rm sgn}\, x\cdot\lambda = {\rm sgn}\, y\cdot\lambda, \\
 &:= \frac{1 - x\cdot y}{8\phi_{xy}}
   \quad {\rm for}~ {\rm sgn}\, x\cdot\lambda \neq {\rm sgn}\, y\cdot\lambda.
\end{aligned} \tag{34}$$

Here $\phi_{xy} \in [0,\pi]$ denotes the angle between these directions,
and the density is defined to be zero when the denominators vanish. The
degree of measurement dependence for this model is given by [14]

$$M_{\rm singlet} = 2(\sqrt{2} - 1)/3 \approx 0.276, \tag{35}$$

corresponding to a fraction of measurement independence
$F_{\rm singlet} \approx 86\%$ in Eq. (27). It will be shown in Sec. VII
that these are, respectively, the smallest possible and largest possible
values of $M$ and $F$, for any deterministic nonsignaling model of the
singlet state. Hence this model is minimal, with a degree of relaxation
of only 14% of measurement independence required.

To calculate the corresponding measurement dependence capacity
$C_{\rm meas\,dep}$ in Eq. (28), note first that the entropy of the
probability density $p(\lambda|x,y)$ is given by

$$H_{xy}(\Lambda) = h\!\left(\frac{1 + x\cdot y}{2}\right)
  + \frac{1}{2}(1 - x\cdot y)\log_2\phi_{xy}
  + \frac{1}{2}(1 + x\cdot y)\log_2(\pi - \phi_{xy}) + \log_2 4.$$

This has a maximum value of $H_{\max} = \log_2 4\pi \approx 3.6515$ bits
(achieved for $x\cdot y = 0, \pm 1$), and a minimum value of
$H_{\min} \approx 3.58521$ bits (for $x\cdot y \approx \pm 0.9148$,
corresponding to an angle $\phi_{xy} \approx 24$ or 156 degrees). Thus,
the probability density is always very close, in the sense of entropy, to
the uniform density $1/(4\pi)$, for any joint measurement setting.

It follows that the mutual information between the measurement settings
and the underlying variable is given by

$$H(X,Y:\Lambda) = H(\Lambda) - \int dx\,dy\; p(x,y)\, H_{xy}(\Lambda)
  \leq \log_2 4\pi - H_{\min},$$

where the inequality is an immediate consequence of
$H_{xy}(\Lambda) \geq H_{\min}$ and of the entropy of $\Lambda$ being
maximised by a uniform distribution on the sphere. Moreover, the
inequality is saturated, for example, by choosing $p(x,y)$ such that $x$
is uniformly distributed on the sphere and, for each value of $x$, $y$ is
uniformly distributed on the circle $x\cdot y \approx 0.9148$. This choice
immediately gives $H_{xy}(\Lambda) = H_{\min}$, while the rotational
symmetry of $p(\lambda|x,y)$ in Eq. (34) yields $p(\lambda) = 1/(4\pi)$,
and hence $H(\Lambda) = \log_2 4\pi$. The measurement dependence capacity
of the model is, therefore,

$$C_{\rm meas\,dep} = \log_2 4\pi - H_{\min} \approx 0.0663~{\rm bits}. \tag{36}$$

This value, about 1/15 of a bit, is seen to be relatively small in
comparison to the 1 bit required when either determinism or no signaling
is relaxed, as well as to the general bound of 1 bit obtained for such
models by Barrett and Gisin [34].
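The entropy formula and the capacity in Eq. (36) can be checked numerically. The following is a minimal sketch, not the author's own calculation: it evaluates the expression for $H_{xy}(\Lambda)$ as a function of $c = x\cdot y$ and locates its minimum by a simple grid search. The function names and the grid size are illustrative choices.

```python
import math

def h(q):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def entropy_xy(c):
    """H_xy(Lambda) in bits as a function of c = x.y, for -1 <= c <= 1."""
    phi = math.acos(c)        # angle between the two measurement directions
    q = (1.0 + c) / 2.0       # total weight of the 'equal signs' region
    total = h(q) + 2.0        # log2(4) = 2
    if phi > 0.0:             # terms of the form 0 * log 0 are taken as 0
        total += (1.0 - q) * math.log2(phi)
    if phi < math.pi:
        total += q * math.log2(math.pi - phi)
    return total

H_MAX = math.log2(4.0 * math.pi)

# Grid search over c in [0, 1]; the entropy is symmetric under c -> -c.
N_GRID = 20000
c_min = min((i / N_GRID for i in range(N_GRID + 1)), key=entropy_xy)
H_MIN = entropy_xy(c_min)
capacity = H_MAX - H_MIN

print(f"H_max = {H_MAX:.5f} bits")
print(f"H_min = {H_MIN:.5f} bits at x.y = {c_min:.4f}")
print(f"C_meas_dep = {capacity:.4f} bits")
```

A finer grid or a one-dimensional optimiser would sharpen the location of the minimum, but the capacity itself is insensitive to this because the entropy is flat near its minimum.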

It is of interest to calculate the mutual information $H(X,Y:\Lambda)$
for this model in two particular scenarios: when the measurement settings
are chosen uniformly from the unit sphere, and when the measurement
settings are chosen randomly from the 4 settings corresponding to maximum
violation of the Bell-CHSH inequality.

In the first case $p(x,y) = 1/(4\pi)^2$, leading via Eq. (34) to
$p(\lambda) = 1/(4\pi)$. Hence,

$$I_{\rm uniform}(X,Y:\Lambda) = \log_2 4\pi - \langle H_{xy}(\Lambda)\rangle
  \approx 0.0280~{\rm bits}.$$

This value, about 1/36 of a bit, may be favourably compared to the values
of 0.85 and 0.28 bits in the corresponding models given by Barrett and
Gisin [34].
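The uniform-settings value can be reproduced by averaging $H_{xy}(\Lambda)$ over the settings. The sketch below assumes only the standard fact that, for two independent directions drawn uniformly from the sphere, $c = x\cdot y$ is uniformly distributed on $[-1, 1]$, so the average reduces to a one-dimensional integral. The midpoint rule and the number of sample points are illustrative choices.

```python
import math

def h(q):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def entropy_xy(c):
    """H_xy(Lambda) in bits as a function of c = x.y."""
    phi = math.acos(c)
    q = (1.0 + c) / 2.0
    total = h(q) + 2.0
    if phi > 0.0:
        total += (1.0 - q) * math.log2(phi)
    if phi < math.pi:
        total += q * math.log2(math.pi - phi)
    return total

# Midpoint rule for the average of H_xy over c uniformly distributed on [-1, 1].
N_POINTS = 20000
step = 2.0 / N_POINTS
average_entropy = sum(
    entropy_xy(-1.0 + (k + 0.5) * step) for k in range(N_POINTS)
) / N_POINTS

I_uniform = math.log2(4.0 * math.pi) - average_entropy
print(f"I_uniform = {I_uniform:.4f} bits (about 1/{1.0 / I_uniform:.0f} of a bit)")
```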

In the second case, the four CHSH settings $(x,y)$, $(x,y')$, $(x',y)$
and $(x',y')$ are defined by measurement directions $x, y, x', y'$ lying
on a great circle, consecutively separated by 45 degrees [2]. One finds
by straightforward calculation that
$H_{xy}(\Lambda) = \log_2\pi + H(q/3, q/3, q/3, 1-q)$ for each setting,
where the second term denotes the entropy of the distribution defined by
its arguments and $q = (1 + 1/\sqrt{2})/2$. One further finds
$H(\Lambda) = \log_2 4\pi$, yielding

$$I_{\rm CHSH}(X,Y:\Lambda) = 2 - H(q/3, q/3, q/3, 1-q)
  \approx 0.0463~{\rm bits}, \tag{37}$$

i.e., about 1/22 of a bit.

To emphasise just how weak a degree of correlation the latter case
represents, suppose that the observers make 22 independent repetitions of
the CHSH experiment. There are then $4^{22} \approx 2\times 10^{13}$
possible sequences of joint measurement settings. Given knowledge of the
corresponding sequence $\lambda_1, \lambda_2, \ldots, \lambda_{22}$ of
underlying variables, the number of possible measurement settings drops
by just a factor of two, to $\approx 10^{13}$. The correlation is,
therefore, very subtle. This is of obvious interest in the physical
simulation of quantum cryptographic protocols via local deterministic
devices.
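The arithmetic behind Eq. (37) and the 22-repetition illustration can be sketched as follows. Treating the total mutual information as the reduction in the logarithm of the number of setting sequences is a heuristic reading of the argument in the text, and the variable names are illustrative.

```python
import math

# Eq. (37): I_CHSH = 2 - H(q/3, q/3, q/3, 1 - q), with q = (1 + 1/sqrt(2))/2.
q = (1.0 + 1.0 / math.sqrt(2.0)) / 2.0
distribution = [q / 3.0, q / 3.0, q / 3.0, 1.0 - q]
entropy_bits = -sum(p * math.log2(p) for p in distribution)
I_chsh = 2.0 - entropy_bits

# 22 independent repetitions, each with 4 possible joint settings.
n_runs = 22
n_sequences = 4 ** n_runs
total_information = n_runs * I_chsh              # roughly one bit in total
remaining_sequences = n_sequences / 2.0 ** total_information

print(f"I_CHSH = {I_chsh:.4f} bits")
print(f"4^22 = {n_sequences:.3e} sequences")
print(f"information over 22 runs = {total_information:.3f} bits")
print(f"sequences remaining = {remaining_sequences:.3e}")
```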



VII. RELAXED BELL INEQUALITIES 

The previous section demonstrates that, to model the singlet state, one
or more of the properties of determinism, nonsignaling and measurement
independence have to be relaxed. As noted in Appendix A, these properties
must similarly be relaxed to model violations of Bell inequalities. Since
such inequalities are directly testable, the question of just how much
relaxation is required, for a given degree of violation, is studied here.
The relaxation of outcome independence is also considered, in
Sec. VII B.

A. Jointly relaxing determinism, no signaling and measurement
independence

1. Main theorem

Let $x, x'$ and $y, y'$ denote possible measurement settings for a first
and second observer, respectively, and label each measurement outcome by
$\pm 1$. If $\langle XY\rangle$ denotes the average product of the
measurement outcomes, for joint measurement setting $(x,y)$, then it is
well known that the Bell-CHSH inequality [2]

$$\langle XY\rangle + \langle XY'\rangle + \langle X'Y\rangle
  - \langle X'Y'\rangle \leq 2$$

must be satisfied if the measured correlations admit an underlying model
with $I = S = M = 0$. Conversely, if this inequality is satisfied by the
measured correlations, then an underlying model can be constructed such
that $I = S = M = 0$.

The joint degrees of relaxation, required to model any 
given violation of the Bell-CHSH inequality, are precisely 
quantified by the following 'relaxed' version: 

Theorem: If an underlying model exists, having values of indeterminism,
signaling and measurement dependence of at most $I$, $S$ and $M$,
respectively, then

$$\langle XY\rangle + \langle XY'\rangle + \langle X'Y\rangle
  - \langle X'Y'\rangle \leq B(I,S,M), \tag{38}$$

with tight upper bound

$$B(I,S,M) = \begin{cases}
  4 - (1 - 2I)(2 - 3M) & {\rm for}~ S < 1 - 2I ~{\rm and}~ M < 2/3, \\
  4 & {\rm otherwise}.
\end{cases} \tag{39}$$

The theorem verifies a conjecture in Ref. [15], where the form of
$B(I,S,0)$ was obtained. The extension to arbitrary $M$ is nontrivial, as
per the proof in Appendix B. Noting that $B(0,0,0) = 2$, the theorem
reduces to the standard Bell-CHSH inequality for models satisfying
determinism, no signaling and measurement independence.

If a given value $2 + V$ is measured for the lefthand side of Eq. (38),
thus violating the standard Bell-CHSH inequality by an amount $V$, the
theorem imposes the strong constraint

$$B(I,S,M) \geq 2 + V \tag{40}$$

on the joint degrees of indeterminism, signaling and measurement
dependence that must be present in any corresponding model of the
violation. This constraint may be regarded as a complementarity relation
for $I$, $S$ and $M$, quantifying the tradeoff required between these
quantities to model a given violation.

Note that signaling is a useful resource for modeling a violation if and
only if the 'gap' condition

$$S \geq S_{\rm gap} := 1 - 2I \tag{41}$$

is satisfied. This corresponds to a degree of signaling sufficient for a
marginal probability to shift across the gap between the subintervals
$[0, I]$ and $[1 - I, 1]$. This property also holds for violations of
other Bell inequalities (see Sec. VII C). Note further that any violation
of the Bell-CHSH inequality can be modeled if $M \geq 2/3$.
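The bound in Eq. (39) and the constraint in Eq. (40) are simple enough to encode directly. The following is an illustrative sketch; the function names and the small numerical tolerance are not part of the paper.

```python
import math

def relaxed_chsh_bound(I, S, M):
    """Tight upper bound B(I, S, M) of Eq. (39)."""
    if S < 1.0 - 2.0 * I and M < 2.0 / 3.0:
        return 4.0 - (1.0 - 2.0 * I) * (2.0 - 3.0 * M)
    return 4.0

def can_model_violation(V, I, S, M, tol=1e-12):
    """Eq. (40): a violation V requires B(I, S, M) >= 2 + V."""
    return relaxed_chsh_bound(I, S, M) >= 2.0 + V - tol

V_QUANTUM = 2.0 * math.sqrt(2.0) - 2.0   # maximum quantum violation

print(relaxed_chsh_bound(0.0, 0.0, 0.0))                     # standard CHSH bound
print(can_model_violation(V_QUANTUM, V_QUANTUM / 4, 0, 0))   # relax determinism only
print(can_model_violation(V_QUANTUM, 0, 0, V_QUANTUM / 3))   # relax measurement independence only
```

Scanning `can_model_violation` over a grid of $(I, S, M)$ values gives a quick picture of the tradeoff surface, including the jump to the trivial bound 4 once the gap condition of Eq. (41) is met.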



2. Example: measurement independent models 

The case $M = 0$ has been extensively discussed elsewhere [15]. For
example, a measurement independent model of the maximum quantum
violation, $V = 2\sqrt{2} - 2$ in Eq. (40), exists if and only if

$$I \geq V/4 \approx 0.207 \quad {\rm and/or} \quad
  S \geq 1 - V/2 \approx 0.586. \tag{42}$$

Further, the randomness and signaling capacities must satisfy

$$C_{\rm random} \geq 0.736~{\rm bits} \quad {\rm and/or} \quad
  C_{\rm sig} \geq 0.264~{\rm bits}, \tag{43}$$

via Eq. (15) and the corresponding expression for the randomness
capacity. Models saturating these bounds are given in the Appendix of
Ref. [15].

It is of interest to compare these bounds with a communication model
recently given by Pawlowski et al., which in the notation of this paper
corresponds to the joint distributions

$$p(a,b|x,y,\lambda) = p(a,b|x,y',\lambda) = p(a,b|x',y,\lambda)
  = \delta_{a\lambda}\,\delta_{b\lambda},$$

$$p(a,b|x',y',\lambda) = \left[p\,(1 - \delta_{a\lambda})
  + (1 - p)\,\delta_{a\lambda}\right]\delta_{b\lambda},$$

with $\lambda = \pm 1$ and $p := \sqrt{2} - 1 \approx 0.414$ (for
arbitrary $p \in [0,1]$, the corresponding violation of the Bell-CHSH
inequality is $V = 2p$). It is straightforward to calculate $I = S = p$
for this model. Hence, the model is nonoptimal in the sense that, as per
Eq. (42), models exist with only half the degree of indeterminism,
$I = p/2 \approx 0.207$, and no signaling, $S = 0$ [15]. Note, however,
that the above model is outcome independent, with $O = 0$.

The randomness capacity of the model is
$C_{\rm random} = h(p) \approx 0.979$ bits. To calculate the signaling
capacity, note that for the measurement setting $x'$, a marginal
probability of the first observer shifts between 0 and $p$, independently
of $\lambda$. Hence, if the second observer chooses between settings $y$
and $y'$ with prior probabilities $w$ and $w' = 1 - w$, the mutual
information that can be communicated is
$H_\lambda(A:Y) = h(w'p) - w'h(p)$, where $h$ denotes the binary entropy
function. For $p = \sqrt{2} - 1$ this is maximised for
$w' \approx 0.393$, yielding the corresponding signaling capacity
$C_{\rm sig} \approx 0.256$ bits.

To compare $C_{\rm sig}$ with the communication capacity in Eq. (20),
note that the model is implemented via the second observer sending a
message bit $m = 0, 1$ to the first observer, with corresponding
probabilities $p(m|y) = \delta_{m0}$ and
$p(m|y') = (1 - p)\,\delta_{m0} + p\,\delta_{m1}$, independently of the
underlying variable $\lambda$ [3?, 38]. Hence, if the settings $y$ and
$y'$ are chosen with prior probabilities $w$ and $w' = 1 - w$, the mutual
information between the setting and the message is given by

$$H_\lambda(M : Y, B) = H(M : Y) = h(w'p) - w'h(p),$$

which is equal to $H_\lambda(A:Y)$ calculated above. Hence, noting the
roles of the first and second observers are reversed relative to the
discussion in Sec. IV D, the model is nonredundant, and

$$C_{\rm commun} = C_{\rm sig} \approx 0.256~{\rm bits}. \tag{44}$$

Finally, it may be noted that for the choice $w = w' = 1/2$, the mutual
information $H(M:Y)$ is $h(p/2) - h(p)/2 \approx 0.247$ bits. This
corrects the value of $h(p/2) \approx 0.736$ bits reported originally for
this model. Thus, fortuitously, less communication is required in this
case than was originally thought.
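The numbers quoted for this communication model follow from the single expression $h(w'p) - w'h(p)$. A minimal sketch, assuming $h$ is the binary entropy in bits and using a plain grid search over $w'$ (the grid size is an illustrative choice):

```python
import math

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

p = math.sqrt(2.0) - 1.0

def signaling_information(w_prime):
    """Mutual information h(w'p) - w' h(p) for prior probability w' of setting y'."""
    return h(w_prime * p) - w_prime * h(p)

# Maximise over the prior w' in [0, 1] by grid search.
N_GRID = 100000
w_opt = max((i / N_GRID for i in range(N_GRID + 1)), key=signaling_information)
C_sig = signaling_information(w_opt)

print(f"C_random = h(p) = {h(p):.3f} bits")
print(f"optimal w' = {w_opt:.3f}, C_sig = {C_sig:.3f} bits")
print(f"equal priors: {signaling_information(0.5):.3f} bits")
print(f"h(p/2) = {h(p / 2.0):.3f} bits")
```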

3. Example: nonsignaling models 

The class of nonsignaling models, with $S = 0$, is of obvious interest.
The upper bound of the theorem in Eq. (38) reduces in this case to

$$B(I,0,M) = \begin{cases}
  4 - (1 - 2I)(2 - 3M) & {\rm for}~ M < 2/3, \\
  4 & {\rm otherwise}.
\end{cases} \tag{45}$$

Thus, for example, a nonsignaling model exists for the maximum quantum
violation, $V = 2\sqrt{2} - 2$, if and only if $(I,M)$ lies on or above
the hyperbola

$$(1 - 2I)(2 - 3M) = 2 - V = 4 - 2\sqrt{2} \tag{46}$$

in the $IM$-plane. This hyperbola has asymptotes $I = 1/2$ and
$M = 2/3$, and intersects the $I$-axis at $I = V/4$ and the $M$-axis at
$M = V/3$. Hence, either of $I \geq V/4 \approx 0.207$ or
$M \geq V/3 \approx 0.276$ is a sufficient (but not necessary) condition
for a nonsignaling model of the maximum quantum violation to exist.
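The tradeoff described by Eq. (46) can be made explicit by solving the hyperbola for $M$ at fixed $I$. The helper below is an illustrative sketch (its name and the clipping at zero are choices made here, not notation from the paper); it returns the least measurement dependence needed by a nonsignaling model for a given degree of indeterminism.

```python
import math

V_QUANTUM = 2.0 * math.sqrt(2.0) - 2.0

def min_measurement_dependence(I, V=V_QUANTUM):
    """Least M with (1 - 2I)(2 - 3M) <= 2 - V, for a nonsignaling model (S = 0)."""
    if I >= 0.5:
        return 0.0
    M = (2.0 - (2.0 - V) / (1.0 - 2.0 * I)) / 3.0
    return max(0.0, M)

for I in (0.0, 0.05, 0.1, 0.15, V_QUANTUM / 4.0):
    print(f"I = {I:.3f}  ->  M >= {min_measurement_dependence(I):.3f}")
```

The endpoints reproduce the two intercepts quoted above: $M = V/3$ at $I = 0$, and $M = 0$ once $I$ reaches $V/4$.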

4. Example: local deterministic models

It is only recently that serious attention has been paid to the case
$I = S = 0$ (see Secs. V and VI). The corresponding underlying models are
both deterministic and nonsignaling, but have some degree of correlation
between the measurement settings and the underlying parameter $\lambda$.
The upper bound of the theorem reduces in this case to

$$B(0,0,M) = \min\{2 + 3M,\ 4\}. \tag{47}$$

This bound is saturated by the models given in Tables I and II of
Ref. [13] (see also Appendix B).

It follows via Eq. (47) that a local deterministic model exists for the
maximum quantum violation, $V = 2\sqrt{2} - 2$, if and only if
$M \geq V/3 \approx 0.276$. This corresponds to a fraction
$F \approx 86\%$ of measurement independence, i.e., measurement
independence need only be relaxed by 14%. Noting that the singlet state
achieves this degree of violation, it further follows that the
deterministic nonsignaling model of singlet state correlations given in
Ref. [14] (also discussed in Sec. VI C above), is optimal in that it has
the smallest degree of measurement dependence possible for any such
model.



B. Relaxing outcome independence 

The measures $I$, $S$ and $M$ are linear with respect to the relevant
probability distributions, making the explicit analytic calculation of
the relaxed bound $B(I,S,M)$ a tractable problem. It is much more
difficult to obtain corresponding bounds if $I$ is replaced by the
quadratic measure of outcome dependence, $O$, introduced in Sec. III.

However, for the case of models satisfying no signaling and measurement
independence (i.e., $S = M = 0$), one may derive the relaxed Bell-CHSH
inequality

$$\langle XY\rangle + \langle XY'\rangle + \langle X'Y\rangle
  - \langle X'Y'\rangle \leq \frac{4}{2 - O}, \tag{48}$$

which holds whenever a model exists with a degree of outcome dependence
no greater than $O$.

Recalling that $0 \leq O \leq 1$ for two-valued outcomes, the right hand
side of this inequality ranges between 2 and 4, and reduces to the
standard Bell-CHSH inequality when outcome independence is satisfied,
i.e., when $O = 0$. Moreover, it follows, for a degree of violation $V$
of the Bell-CHSH inequality, that a nonsignaling and measurement
independent model exists if and only if $4/(2 - O) \geq 2 + V$. In
particular, for the maximum quantum degree of violation,
$V = 2\sqrt{2} - 2$, such a model exists if and only if

$$O \geq \frac{2V}{2 + V} = 2 - \sqrt{2} \approx 0.586. \tag{49}$$

Further, from Eq. (13) the maximum mutual information between the
outcomes must be at least

$$C_{\rm outcome} \geq 1 - h\!\left(\frac{2 + 3V}{4 + 2V}\right)
  \approx 0.264~{\rm bits}. \tag{50}$$
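The thresholds in Eqs. (48) to (50) can be checked with a few lines of arithmetic. In this sketch the argument of the binary entropy in Eq. (50) is taken to be $(2+3V)/(4+2V)$, which equals $(1+O)/2$ at the threshold value of $O$; this reading of the garbled source equation is an assumption, chosen because it reproduces the quoted 0.264 bits. Function names are illustrative.

```python
import math

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def outcome_relaxed_bound(O):
    """Right hand side of Eq. (48)."""
    return 4.0 / (2.0 - O)

def min_outcome_dependence(V):
    """Smallest O with 4/(2 - O) >= 2 + V, as in Eq. (49)."""
    return 2.0 * V / (2.0 + V)

V_QUANTUM = 2.0 * math.sqrt(2.0) - 2.0
O_min = min_outcome_dependence(V_QUANTUM)

# Assumed reading of Eq. (50): C_outcome >= 1 - h((2 + 3V)/(4 + 2V)).
C_outcome_min = 1.0 - h((2.0 + 3.0 * V_QUANTUM) / (4.0 + 2.0 * V_QUANTUM))

print(f"O_min = {O_min:.3f}")
print(f"bound at O_min = {outcome_relaxed_bound(O_min):.4f}")
print(f"C_outcome >= {C_outcome_min:.3f} bits")
```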

To obtain the relaxed Bell inequality in Eq. (48), let
$\langle XY\rangle_\lambda$ denote the expectation value of the product
of measurement outcomes for settings $x$ and $y$, for a given underlying
variable $\lambda$, and define

$$E_\lambda := \langle XY\rangle_\lambda + \langle XY'\rangle_\lambda
  + \langle X'Y\rangle_\lambda - \langle X'Y'\rangle_\lambda.$$

Defining the probabilities $c_j$, $m_j$ and $n_j$ as per Appendix B, one
has

$$E_\lambda = 2 + 2\sum_{j=1}^{3}(2c_j - m_j - n_j) - 2(2c_4 - m_4 - n_4).$$

Further, the no-signaling assumption allows one to rewrite the marginals
as $m := m_1 = m_2$, $m' := m_3 = m_4$, $n := n_1 = n_3$, and
$n' := n_2 = n_4$, leading to

$$E_\lambda = 2 + 4(c_1 + c_2 + c_3 - c_4) - 4(m + n).$$

Now, noting Eqs. (A2) and (A3), $c_j$ must lie between the lower and
upper bounds $\max\{0,\ m_j + n_j - 1,\ m_jn_j - O/4\}$ and
$\min\{m_j,\ n_j,\ m_jn_j + O/4\}$. Hence, replacing $c_j$ by its upper
bound for $j = 1, 2, 3$ and $c_4$ by its lower bound, one obtains, after
some simplification, the corresponding tight inequality

$$E_\lambda \leq 4\left[f(1 - m, 1 - n, O) + f(m, n', O) + f(m', n, O)
  + f(m', 1 - n', O)\right] - 4m' - 2,$$

where $f(a,b,c) := \min\{a,\ b,\ ab + c/4\}$. The maximum value of the
right hand side over all marginal probabilities
$m, m', n, n' \in [0,1]$, for a fixed degree of outcome dependence $O$,
is found numerically to occur when $m' = 1/2$ and $n = n' = 1 - O/2$.
Substituting these values into the right hand side, and maximising over
$m$, yields the upper bound $4/(2 - O)$, achieved for
$m = 3/2 - 1/(2 - O)$. Averaging over $\lambda$ then yields Eq. (48) as
required.
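The numerical maximisation mentioned above can be illustrated with a coarse grid search. This is a sketch only, not the author's procedure: it evaluates the right hand side of the inequality for $E_\lambda$ with $f(a,b,c) = \min\{a, b, ab + c/4\}$, compares the value at the stated maximiser with $4/(2-O)$, and checks that no grid point exceeds it. The grid resolution and all function names are illustrative.

```python
from itertools import product

def f(a, b, c):
    """f(a, b, c) := min{a, b, ab + c/4}."""
    return min(a, b, a * b + c / 4.0)

def rhs(m, m_prime, n, n_prime, O):
    """Right hand side of the tight inequality for E_lambda."""
    return 4.0 * (
        f(1.0 - m, 1.0 - n, O)
        + f(m, n_prime, O)
        + f(m_prime, n, O)
        + f(m_prime, 1.0 - n_prime, O)
    ) - 4.0 * m_prime - 2.0

def value_at_stated_maximiser(O):
    """Evaluate at m' = 1/2, n = n' = 1 - O/2, m = 3/2 - 1/(2 - O)."""
    m = 1.5 - 1.0 / (2.0 - O)
    n = 1.0 - O / 2.0
    return rhs(m, 0.5, n, n, O)

def grid_maximum(O, steps=10):
    """Largest value of the right hand side on a uniform grid over [0, 1]^4."""
    points = [i / steps for i in range(steps + 1)]
    return max(rhs(m, mp, n, np_, O) for m, mp, n, np_ in product(points, repeat=4))

for O in (0.0, 0.5, 1.0):
    print(O, value_at_stated_maximiser(O), 4.0 / (2.0 - O), grid_maximum(O))
```

A grid search cannot prove the bound, but it gives a quick consistency check on the stated maximiser for any chosen $O$.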

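For the above values of $m, m', n, n'$ one has $c_1 = c_2 = 1 - O/2$,
$c_3 = 1/2$ and $c_4 = (1 - O)/2$, implying that a set of probability
distributions saturating Eq. (48) is given by

$$p_1 = p_2 = \left(1 - \frac{O}{2},\ \ \frac{1 + O}{2} - \frac{1}{2 - O},\ \
  0,\ \ \frac{1}{2 - O} - \frac{1}{2}\right),$$

$$p_3 = \left(\frac{1}{2},\ 0,\ \frac{1 - O}{2},\ \frac{O}{2}\right), \qquad
  p_4 = \left(\frac{1 - O}{2},\ \frac{O}{2},\ \frac{1}{2},\ 0\right),$$

where it is recalled from Appendix B that
$p_1 \equiv p(a,b|x,y,\lambda)$, $p_2 \equiv p(a,b|x,y',\lambda)$, etc.
This model is nonsignaling by construction, but is maximally
indeterministic, with $I = 1/2$. Note that the distributions correspond
to a PR-box for $O = 1$ [2?].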

The corresponding outcome correlation capacity of this model follows via
Eq. (11) as

$$C_{\rm outcome} = g[O/2] + g[3/2 - 1/(2 - O)] - g[(1 + O)/2 - 1/(2 - O)],$$

where $g[x] := -x\log_2 x$, and ranges from a minimum of 0 for $O = 0$ to
a maximum of 1 bit for $O = 1$. For the case of maximum quantum
violation, $O = 2 - \sqrt{2}$, one has
$C_{\rm outcome} \approx 0.480$ bits. Thus, less than half a bit of
outcome correlation is required to model this degree of violation.
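The saturating model and its outcome correlation capacity can be verified directly. The sketch below assumes each distribution lists the probabilities of the outcome pairs $(+,+), (+,-), (-,+), (-,-)$ in that order, and that the four distributions are $p_1 = p_2 = (1 - O/2,\ (1+O)/2 - 1/(2-O),\ 0,\ 1/(2-O) - 1/2)$, $p_3 = (1/2, 0, (1-O)/2, O/2)$ and $p_4 = ((1-O)/2, O/2, 1/2, 0)$; both the ordering and this reading of the garbled source are assumptions, checked here only through normalisation and the CHSH value $4/(2-O)$.

```python
import math

def g(x):
    """g[x] := -x log2 x, with g[0] = 0."""
    return 0.0 if x <= 0.0 else -x * math.log2(x)

def saturating_distributions(O):
    """Distributions p1..p4 over outcome pairs (++, +-, -+, --), as assumed above."""
    k = 1.0 / (2.0 - O)
    p12 = (1.0 - O / 2.0, (1.0 + O) / 2.0 - k, 0.0, k - 0.5)
    p3 = (0.5, 0.0, (1.0 - O) / 2.0, O / 2.0)
    p4 = ((1.0 - O) / 2.0, O / 2.0, 0.5, 0.0)
    return p12, p12, p3, p4

def correlator(p):
    """Expectation of the product of the two +/-1 outcomes."""
    return p[0] - p[1] - p[2] + p[3]

def chsh_value(O):
    e1, e2, e3, e4 = (correlator(p) for p in saturating_distributions(O))
    return e1 + e2 + e3 - e4

def mutual_information(p):
    """Mutual information in bits between the two outcomes."""
    pa = (p[0] + p[1], p[2] + p[3])
    pb = (p[0] + p[2], p[1] + p[3])
    return sum(g(v) for v in pa) + sum(g(v) for v in pb) - sum(g(v) for v in p)

def outcome_capacity(O):
    """Closed form quoted in the text for the capacity of this model."""
    k = 1.0 / (2.0 - O)
    return g(O / 2.0) + g(1.5 - k) - g((1.0 + O) / 2.0 - k)

O_Q = 2.0 - math.sqrt(2.0)
print(f"CHSH value at O_Q: {chsh_value(O_Q):.4f}")
print(f"C_outcome at O_Q:  {outcome_capacity(O_Q):.3f} bits")
```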

It is possible, in principle, to generalise Eq. (48) to obtain a relaxed
Bell inequality corresponding to jointly relaxing both outcome
independence and no signaling. The $m_j$ and $n_j$ now remain distinct,
and subject to Eq. (B5). The corresponding bound, $B(O,S)$, would
quantify the complementary contributions required from jointly relaxing
outcome independence and no signaling, to model a given violation of the
standard Bell-CHSH inequality.



C. Relaxing I3322 and other Bell inequalities

Consider a Bell inequality of the general linear form

$$A_\alpha := \sum_{a,b,j,k} \alpha_{abjk}\; p(a,b|x_j,y_k) \leq B_\alpha,$$

where the upper bound holds for any underlying model with
$I = S = M = 0$. It is not difficult, in principle, to quantify the joint
degrees of relaxation of determinism and no signaling required for
modeling violations of such Bell inequalities. This is done via
determining the corresponding least upper bound, $B_\alpha(I,S)$, of
$A_\alpha$.

In particular, determining $B_\alpha(I,S)$ may be reduced to a standard
linear programming problem (solvable in polynomial time). One defines the
linear function $A_\alpha(\lambda)$, by replacing $p(a,b|x_j,y_k)$ with
$p(a,b|x_j,y_k,\lambda)$ in the above expression for $A_\alpha$, and
maximises over all joint probability distributions subject to the linear
constraints of positivity, normalisation,
$p(a|x_j,y_k,\lambda),\ p(b|x_j,y_k,\lambda) \in [0,I]\cup[1-I,1]$, and
$|p(a|x_j,y_k,\lambda) - p(a|x_j,y_{k'},\lambda)|,\
|p(b|x_j,y_k,\lambda) - p(b|x_{j'},y_k,\lambda)| \leq S$. The maximum
value is the desired upper bound $B_\alpha(I,S)$. In particular, since
$p(\lambda|x_j,y_k) = p(\lambda)$ for $M = 0$, the integration of
$A_\alpha(\lambda)$ over $\lambda$ yields the relaxed Bell inequality

$$A_\alpha \leq B_\alpha(I,S).$$

The case where measurement independence is also relaxed is more difficult
(see, e.g., Appendix B for the case of the relaxed Bell-CHSH inequality),
and a general procedure remains to be found.

As an example which can be treated analytically, a variant of the I3322
inequality obtained by Collins and Gisin will be considered here. The
I3322 inequality is the canonical Bell inequality for the case of 3
measurement settings for each observer and two-valued measurement
outcomes, and has the form

$$I_{3322}(a,b) := \sum_{j,k=1}^{3} a_{jk}\; p(a,b|x_j,y_k) - p(a|x_1)
  - 2p(b|y_1) - p(b|y_2) \leq 0,$$

with $a_{jk} = 1$ for $j + k \leq 4$, $a_{23} = a_{32} = -1$, and
$a_{33} = 0$. Note that this form is not suitable for dealing with models
having a non-zero degree of signaling $S$, since the marginals
$p(a|x_j)$ and $p(b|y_j)$ are not well defined in such a case (e.g., one
may have $p(a|x_j,y_1,\lambda) \neq p(a|x_j,y_2,\lambda)$). However,
multiplying by the nonnegative quantity $1 + ab$ and summing over $a$ and
$b$ yields a suitable variant:

$$A_{3322} := \sum_{j,k} a_{jk}\, \langle X_jY_k\rangle \leq 4, \tag{51}$$

where $\langle X_jY_k\rangle$ denotes the expectation of the product of
measurement outcomes for the joint measurement setting $(x_j, y_k)$.

The corresponding relaxed Bell inequality is then

$$A_{3322} \leq B_{3322}(I,S) := \begin{cases}
  4 + 8I & {\rm for}~ S < 1 - 2I, \\
  8 & {\rm otherwise},
\end{cases} \tag{52}$$

and is derived in Appendix C. This inequality is tight; reduces to
Eq. (51) for $I = S = 0$; and is seen to be exactly twice the upper
bound, $B(I,S,M)$, of the relaxed Bell-CHSH inequality in Eq. (38) for
$M = 0$.

A generalisation of Eq. (52) to $m$ measurement settings on each side is
conjectured in Appendix C.
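The local bound in Eq. (51) and the factor of two relating Eq. (52) to the relaxed Bell-CHSH bound can be checked by brute force. This sketch assumes the coefficients are $a_{jk} = 1$ for $j + k \leq 4$, $a_{23} = a_{32} = -1$ and $a_{33} = 0$, and enumerates all deterministic assignments of $\pm 1$ outcomes to the six observables; the function names are illustrative.

```python
from itertools import product

# Coefficients a_jk of the I3322 variant, indexed by (j, k) with j, k = 1, 2, 3.
COEFFS = {(j, k): 1 for j in (1, 2, 3) for k in (1, 2, 3) if j + k <= 4}
COEFFS[(2, 3)] = -1
COEFFS[(3, 2)] = -1
COEFFS[(3, 3)] = 0

def a3322(xs, ys):
    """Value of sum_jk a_jk X_j Y_k for fixed outcomes xs, ys (each +/-1)."""
    return sum(c * xs[j - 1] * ys[k - 1] for (j, k), c in COEFFS.items())

# Maximum over all local deterministic assignments (I = S = M = 0).
local_maximum = max(
    a3322(v[:3], v[3:]) for v in product((1, -1), repeat=6)
)

def relaxed_3322_bound(I, S):
    """B_3322(I, S) of Eq. (52)."""
    return 4.0 + 8.0 * I if S < 1.0 - 2.0 * I else 8.0

def relaxed_chsh_bound_m0(I, S):
    """B(I, S, 0) from Eq. (39) with M = 0."""
    return 4.0 - 2.0 * (1.0 - 2.0 * I) if S < 1.0 - 2.0 * I else 4.0

print("local deterministic maximum:", local_maximum)
print("algebraic maximum:", sum(abs(c) for c in COEFFS.values()))
```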



VIII. HOW MUCH FREE WILL DO 
EPR-KOCHEN-SPECKER THEOREMS NEED? 

The original Kochen-Specker theorem showed that one 
cannot consistently assign any pre-existing measurement 
outcomes to a particular set of (117) quantum observ- 
ables on a three-dimensional Hilbert space, under the 
assumption of 'noncontextuality', i.e., that the outcome 
assigned to one observable is independent of whether or 
not it is simultaneously measured with a compatible observable [7]. A
similar result was obtained independently by Bell [8], but relying on a
continuum of observables. Both results have the advantage of holding
independently of the quantum state. However, as pointed out by Bell, the
noncontextuality assumption is rather strong. For example, if the
compatible observables are measured in the same local region of
spacetime, then there is no compelling physical reason why simultaneous
measurement contexts should not 'interfere' with each other in some way.

Heywood and Redhead were able to substantially strengthen the basis for
the noncontextuality assumption, by only requiring that it hold for
observables measured in spacelike separated regions, and restricting
attention to quantum states for which these observables were perfectly
correlated. Thus, they were able to effectively replace (or justify)
noncontextuality, in their version of the Kochen-Specker theorem, via the
physically more plausible assumption of no signaling - albeit at the mild
expense of having to restrict attention to particular quantum states.
Note also that, as per the argument for 'elements of reality' by
Einstein, Podolsky and Rosen (EPR) [13], perfect correlations between
distant observables motivate why one might wish to assign pre-existing
measurement outcomes in the first place (see also Sec. III A). Hence, the
Heywood-Redhead result, and later simplified versions, may be referred to
as 'EPR-Kochen-Specker theorems'.

EPR-Kochen-Specker theorems are seen to rely on 
assumptions essentially equivalent to determinism (pre-existing
outcomes), and no signaling (each outcome is independent of what is
measured in a spacelike separated region). They in fact also rely on a
further assumption, only first made explicit by Conway and Kochen [12]:
that experimenters can freely choose to measure any of the observables in
question. Thus, an assumption implying measurement independence is also
required. All such theorems have, therefore, similar significance to Bell
inequalities.

However, EPR-Kochen-Specker theorems are distin- 
guished from Bell inequalities in the important respect 
that they are not statistical in character: they show that 
particular correlated observables cannot be logically as- 
signed any set of fixed outcomes, irrespective of the prob- 
abilities of these outcomes. Hence, relaxing the assump- 
tions of determinism or no signaling would contradict 
the essence of these theorems. In contrast, it is natural 
to consider by how much the degree of measurement in- 
dependence must be relaxed, to be able to consistently 
assign such a set of pre-existing measurement outcomes. 

It is shown below that an EPR-Kochen-Specker theorem due to Mermin [10]
is quite robust: one must relax measurement independence by at least 1/3
to allow pre-existing measurement outcomes to be assigned. In contrast,
the Conway-Kochen 'free will' theorem [12] and a theorem due to Hardy
[11] fail if measurement independence is relaxed by only 6.5% and 4.5%,
respectively.



A. Relaxing Mermin's theorem 

Mermin gave an EPR-Kochen-Specker theorem for three mutually spacelike
separated observers, who may be labelled Alice, Bob and Charlie. The
observers conduct a joint experiment where Alice measures one of two
observables $A, A'$, Bob measures one of two observables $B, B'$, and
Charlie measures one of two observables $C, C'$, with each observer's
outcome labelled by $\pm 1$. The observables are assumed to exhibit the
perfect correlations

$$\langle ABC'\rangle = \langle AB'C\rangle = \langle A'BC\rangle = 1,
  \qquad \langle A'B'C'\rangle = -1, \tag{53}$$

where $\langle XYZ\rangle$ denotes the expectation value of the product
of the outcomes of observables $X$, $Y$ and $Z$. Such correlations can be
implemented quantum mechanically, for example, when
$A, A', B, B', C, C'$ correspond to the spin-1/2 observables
$\sigma_x, \sigma_y, \sigma_x, \sigma_y, \sigma_x, \sigma_y$,
respectively, and the observers share the tripartite state
$|\psi\rangle$ defined by the $+1$ eigenvalues of the commuting operators
$\sigma_x\sigma_x\sigma_y$, $\sigma_x\sigma_y\sigma_x$ and
$\sigma_y\sigma_x\sigma_x$ [10].

TABLE I: A class of local deterministic models for Mermin's correlations

Mermin argued that, if the existence of an underlying nonsignaling model
is assumed, 'one is impelled to conclude' that the measurement outcomes
are predetermined. Of course, one is not compelled to conclude this:
determinism does not logically follow from the combination of no
signaling and perfect correlations, as discussed in Sec. III A. However,
if the model is assumed to be deterministic, then the outcomes of
$A, A', B, B', C, C'$ are fixed prior to any measurements, and may be
denoted by $a, a', b, b', c, c' = \pm 1$ for any given run of the
experiment. The perfect correlations then appear to imply that

$$abc' = ab'c = a'bc = 1, \qquad a'b'c' = -1, \tag{54}$$

which is clearly inconsistent for any assignment of values [10] (since
the product of the first three equations gives $a'b'c' = 1$). It
therefore seems that there is no deterministic nonsignaling model of the
correlations.
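The inconsistency can also be confirmed by exhaustive enumeration, since there are only $2^6 = 64$ assignments of $\pm 1$ to $a, a', b, b', c, c'$. The sketch below takes the four conditions to be $abc' = ab'c = a'bc = 1$ and $a'b'c' = -1$; the function and variable names are illustrative.

```python
from itertools import product

def conditions(a, a_p, b, b_p, c, c_p):
    """The four perfect-correlation conditions, as a tuple of booleans."""
    return (
        a * b * c_p == 1,
        a * b_p * c == 1,
        a_p * b * c == 1,
        a_p * b_p * c_p == -1,
    )

assignments = list(product((1, -1), repeat=6))

# Assignments satisfying all four conditions simultaneously (expected: none).
consistent = [v for v in assignments if all(conditions(*v))]

# Largest number of conditions that any single assignment can satisfy.
best = max(sum(conditions(*v)) for v in assignments)

print("assignments satisfying all four conditions:", len(consistent))
print("maximum number of conditions satisfied at once:", best)
```

That three of the four conditions can always be met, but never all four, is what leaves room for a deterministic model once the choice of joint measurement is allowed to correlate with the underlying variable.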

However, the derivation of Eq. (54) in fact requires a further
assumption, not explicitly discussed by Mermin: that Alice can always
choose which one of $A$ and $A'$ to measure in each run of the
experiment, and similarly for Bob and Charlie. If this assumption is not
made, it is in fact possible to construct a deterministic nonsignaling
model of the correlations in Eq. (53), as is demonstrated in Table I.

The model in Table I has an underlying variable $\lambda$ taking four
possible values, $\lambda_1, \lambda_2, \lambda_3, \lambda_4$. For each
$\lambda_j$ the corresponding measurement outcomes are deterministically
and locally specified, via 12 fixed numbers $a_j, b_j, c_j = \pm 1$. The
underlying probability density, $p(\lambda|A,B,C')$, corresponding to a
joint measurement of $A$, $B$ and $C'$ is denoted by $p_{ABC'}$, and
similarly for the other joint measurements appearing in Eq. (53). It is
easily checked that this model reproduces the perfect correlations in
Eq. (53) with, e.g.,
$\langle ABC'\rangle = \sum_j p_{ABC'}(\lambda_j)\, A(\lambda_j)
B(\lambda_j) C'(\lambda_j) = 1$.

Hence, there is indeed a deterministic nonsignaling 
model for these correlations, as claimed. However, this is 
at the cost of relaxing measurement independence, i.e., 
of introducing correlations between the measurement settings and the
underlying variable (see Sec. V). For example, from Table I the joint
measurement of $A'$, $B'$ and $C'$ cannot be performed if the underlying
variable is equal to $\lambda_1$.

The degree of measurement dependence of the model may be calculated via
Eq. (26) as $M = 2/3$, corresponding to a fraction $F = 2/3$ of
measurement independence in Eq. (27). Thus, one third of measurement
independence must be given up. The corresponding measurement dependence
capacity may also be calculated, via Eq. (28), as
$C_{\rm meas\,dep} = \log_2(4/3) \approx 0.415$ bits (achieved by
choosing between the four possible joint measurements with equal
probabilities). Thus, less than half a bit of correlation is required
between the settings and the underlying variable.

It is important to note that the above model does not simulate the Mermin
state $|\psi\rangle$; nor is that the aim here. The much more modest aim
is to calculate the degree to which measurement independence must be
relaxed to overcome the conclusions of Mermin's theorem, i.e., to provide
a local deterministic model of the perfect correlations in Eq. (53).
However, it would certainly be of interest to generalise the local
deterministic model of the singlet state in Sec. VI C, to find a similar
optimal model for Mermin's state.



B. Relaxing the 'free will' theorem 

Conway and Kochen have given a theorem of the same 
ilk as Mermin's theorem above, the main differences be- 
ing (i) only two observers are required, and (ii) the need 
for a further assumption such as 'free will' is explicitly 
noted [12]. However, it will be seen that this 'free will'
theorem is weaker than Mermin's theorem, in the sense 
that measurement independence needs only to be relaxed 
by 6.5% to give a local deterministic model of the corre- 
lations. 

Briefly, Conway and Kochen consider two distant observers, each of whom
measures a two-valued observable labeled by members of a particular set
of unit 3-vectors, with possible measurement outcomes 0 or 1. The
outcomes are assumed to exhibit perfect correlations when the same
measurement direction is chosen by both observers, i.e.,

$$p(a = b|x,x) = 1.$$

It is further assumed that the measurements corresponding to any
orthogonal triple of measurement directions, $x, y, z$ say, can be
performed simultaneously by either observer, and always give the outcomes
1, 0, 1 in some order. Such correlations can be implemented quantum
mechanically, for example, via the observers sharing a pair of spin-1
particles in a state of total spin zero, where the observable labeled by
direction $j$ corresponds to the square of the spin observable in that
direction [12].

Conway and Kochen show there is a particular set of 33 measurement
directions, $D_{33}$, for which there is no underlying model of the above
correlations which satisfies determinism, no signaling and measurement
independence. They conclude that particles have 'exactly the same kind' of
free will as experimenters, where both indeterminism and measurement
independence are equated with 'free will', for particles and experimenters
respectively. However, a model having 0% indeterminism and 93.5% measurement
independence is given below.

In particular, to construct a deterministic nonsignaling model of the above
correlations, note first that $D_{33}$ is minimal in the sense observed by
Peres [39]: for each direction $w \in D_{33}$ there exists a corresponding
function $\theta_w(x)$, from $D_{33}$ to $\{0, 1\}$, such that

$$ \theta_w(x) + \theta_w(y) + \theta_w(z) = 2 $$

for any mutually orthogonal triple $(x, y, z)$ satisfying $x, y, z \neq w$.
Hence, consider a model having the underlying joint probabilities

$$ p(a, b|x, y, \lambda_w) := \delta_{a,\theta_w(x)}\, \delta_{b,\theta_w(y)}, $$

where the possible values of the underlying variable are labeled by
$w \in D_{33}$. This model is clearly deterministic and nonsignaling, and
satisfies $p(a = b|x, x) = 1$ as required. Further, by construction, the
outcomes for a simultaneous measurement of any mutually orthogonal triple
$(x, y, z)$ must be 1, 0, 1 in some order, provided that no member of the
triple is equal to $w$. Finally, the latter proviso may be guaranteed to hold
in any actual joint measurement by defining the probability distribution of
the underlying variable to be

$$ p(\lambda_w|x, y) := \begin{cases} 0, & w = x \text{ or } w = y, \\ \dfrac{1}{31 + \delta_{xy}}, & \text{otherwise.} \end{cases} $$

Hence, no measurement can be made in the direction 
corresponding to the label of the underlying variable. 

The degree of measurement dependence of the above model can be calculated
via Eq. (26) as $M = 4/31$, achieved for the case of joint measurements
$(x, y)$, $(x', y')$ having no directions in common. This corresponds to a
fraction $F = 29/31 \approx 93.5\%$ of measurement independence in Eq. (27),
i.e., measurement independence only needs to be relaxed by $\approx 6.5\%$.
The measurement dependence capacity can be estimated via Eq. (25) as

$$ C_{\rm meas\,dep} = H_{\max}(\Lambda) - H_{\min}(\Lambda) = \log_2 \frac{33}{31}, $$

where the upper entropy bound follows from taking 33 possible values, and
the lower bound corresponds to any joint setting with $x \neq y$. Thus,
$\approx 0.0902$ bits - less than one tenth of one bit of correlation - is
required between the underlying variable and the measurement settings.
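
As a numerical cross-check of the figures quoted above, the following Python sketch builds the distribution of the underlying variable for two joint settings with no directions in common and evaluates $M$, $F$ and the entropy difference. It assumes that Eq. (26) defines $M$ as the largest variational distance between such distributions, that $F = 1 - M/2$ (consistent with the quoted values), and that the distribution is uniform over the directions not being measured. The integer labels 0..32 and all function names are illustrative only.

```python
from fractions import Fraction
from math import log2

N_DIRECTIONS = 33  # size of the Conway-Kochen direction set D_33

def p_lambda(x, y):
    """p(lambda_w | x, y): zero for w in {x, y}, uniform elsewhere (assumed)."""
    excluded = {x, y}
    weight = Fraction(1, N_DIRECTIONS - len(excluded))
    return [Fraction(0) if w in excluded else weight
            for w in range(N_DIRECTIONS)]

def variational_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def entropy_bits(p):
    return -sum(float(a) * log2(float(a)) for a in p if a > 0)

# Two joint settings with no direction in common (hypothetical labels).
p1 = p_lambda(0, 1)
p2 = p_lambda(2, 3)

M = variational_distance(p1, p2)          # 4/31
F = 1 - M / 2                             # 29/31
C = log2(N_DIRECTIONS) - entropy_bits(p1) # log2(33/31)

print(M, F, round(C, 4))
```

Exact rational arithmetic is used for $M$ and $F$ so that the fractions 4/31 and 29/31 can be compared directly with the text.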

C. Relaxing Hardy's theorem 

Finally, it is of interest to also consider a result due to Hardy, which
derives an EPR-Kochen-Specker theorem having a minor statistical element
[11]. In particular, first and second observers each measure one of two
observables $U_j$ and $D_j$, where $j = 1, 2$ refers to the observer.
Labelling the corresponding measurement outcomes by $u_j, d_j = 0$ or 1, it
is assumed that they satisfy the perfect correlations

$$ u_1 u_2 = 0, \qquad d_1 = 1 \Rightarrow u_2 = 1, \qquad d_2 = 1 \Rightarrow u_1 = 1, $$

and further that the joint outcome $d_1 = d_2 = 1$ can occur with some
probability $\gamma > 0$. Such correlations can be implemented quantum
mechanically via the observers sharing one of a large class of two-qubit
states, providing that [11]

$$ \gamma \leq \gamma_{\max} := (5\sqrt{5} - 11)/2 \approx 9\%. $$

Hardy argues that there is no deterministic nonsignaling model of such
correlations, on the grounds that such a model must predict values
$d_1 = 1 = d_2$ in at least some instances, which is incompatible with any
simultaneous assignment of values of $u_1$ and $u_2$ as per the required
correlations [11]. However, this argument makes an implicit assumption that
the model is measurement independent. If this assumption is relaxed, it is
quite straightforward to write down deterministic nonsignaling models of the
correlations, as is done in Table II.

TABLE II. A class of local deterministic models for Hardy's correlations
[note $\gamma' := (1 - \gamma)/2$].

The class of models in Table II is defined via an underlying variable
$\lambda$ taking 5 possible values $\lambda_1, \lambda_2, \ldots, \lambda_5$,
and corresponding deterministic outcomes specified by two numbers
$a, b = 0$ or 1 (thus, there are four distinct models, corresponding to the
choices of $a$ and $b$). The underlying probability distribution
$p(\lambda|U, U)$ is denoted by $p_{UU}$, and similarly for the other joint
settings $(U, D)$, $(D, U)$ and $(D, D)$. The required correlations can all
be checked to hold whenever they can be measured. For example,
$u_1 u_2 = 0$ identically except for $\lambda = \lambda_5$, but the
probability of $\lambda = \lambda_5$ vanishes for the corresponding setting
$(U, U)$.

The associated degree of measurement dependence is easily calculated via
Eq. (26) as $M = \gamma$, with associated fraction of measurement
independence $F = 1 - \gamma/2$. Hence, measurement independence need only be
relaxed by at most $\gamma_{\max}/2 \approx 4.5\%$ to model the correlations.
One can also estimate the degree of correlation required between the
underlying variable and the measurement settings via

$$ C_{\rm meas\,dep} \leq H_{\max}(\Lambda) - H_{\min}(\Lambda) = \gamma \log_2 \frac{3}{2} \approx 0.585\,\gamma. $$

Here the maximum entropy value corresponds to choosing between the four joint
settings with equal probabilities, while the minimum value corresponds to the
$(D, D)$ setting. For $\gamma = \gamma_{\max}$ this gives a bound of
$\approx 0.053$ bits.
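
The numbers quoted for Hardy's correlations follow from elementary arithmetic, which the short Python sketch below reproduces. It takes the maximum Hardy probability as $(5\sqrt{5} - 11)/2$, the relaxation of measurement independence as half of that value, and the capacity bound as that value times $\log_2(3/2)$, all as read from the text; the variable names are illustrative.

```python
from math import sqrt, log2

# Largest probability of the joint outcome d1 = d2 = 1, as quoted in the text.
gamma_max = (5 * sqrt(5) - 11) / 2

# Required relaxation of measurement independence, 1 - F with F = 1 - gamma/2.
relaxation = gamma_max / 2

# Upper bound on the measurement dependence capacity, gamma * log2(3/2) bits.
capacity_bound = gamma_max * log2(3 / 2)

print(f"gamma_max      = {gamma_max:.4f}")
print(f"relaxation     = {relaxation:.4f}")
print(f"capacity bound = {capacity_bound:.4f} bits")
```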

IX. CONCLUSIONS 

The main aim of this paper has been to carefully define the quantitative
degrees to which certain physical properties hold for underlying models of
statistical correlations (Secs. III-V), and to show how these may be applied
to determine optimal singlet state models (Sec. VI); the minimal degrees of
relaxation required to simulate violations of various Bell inequalities
(Sec. VII); and the relative robustness of Kochen-Specker theorems
(Sec. VIII). The results help to both clarify and quantify the nonclassical
nature of quantum correlations, including the resources required for their
simulation.

A number of possible directions for future work are suggested by the results
of the paper. First, while the information-theoretic measures defined in
Secs. III-V quantify various resources required to simulate correlations,
little is known about the interconversion of these resources. For example,
while Barrett and Gisin
show how a communication model may be converted 
into a measurement dependent model [s^ (see also pol|). 
with $C_{\rm commun} = C_{\rm meas\,dep}$, it is not clear how to proceed
in the reverse direction. Nor has the conjecture
Csig + C random > 1 bit [l^ [s^ , for measurement in- 
dependent models of singlet state correlations, yet been 
proved. 

Second, for signaling to be a useful resource for modeling violations of
standard Bell inequalities in Eqs. (38), (52) and (C2), the 'gap' condition
$S \geq 1 - 2I$ in Eq. (41) must be satisfied. This condition corresponds to
signaling of a degree sufficient to be able to 'flip' a marginal probability
from $p$ to $1 - p$, and it would be of interest to know whether it
generalises to all Bell inequalities.

Third, it has been seen in Secs. VI-VIII that the relaxation of measurement
independence is a remarkably strong resource for modeling quantum
correlations. For example, as per Eq. (37), one requires a correlation
between the measurement settings and the underlying variable of only
$\approx 1/22$ of a bit, to obtain a local deterministic model of the CHSH
scenario. It would be of interest to exploit such a model to simulate quantum
cryptographic protocols. It would similarly be of interest to generalise the
local deterministic model of the singlet state, discussed in Sec. VI, to find
corresponding optimal models for the quantum states that generate the perfect
correlations in Sec. VIII. Presumably, the required degree of relaxation of
measurement independence will increase with Hilbert space dimension, to some
saturating value $M^* \leq 2$. It is not known if $M^* < 2$.

Finally, it would be of interest to generalise the relaxed 
Bell inequality in Eq. (|151) . to include the relaxation of 
no signaling and measurement independence, similarly to 
the analogous inequality in Eq. (38). This would also allow determination of
whether the model of Pawlowski et al. [IP], discussed in Sec. VI A, has the
minimal possible degree of signaling for the case $O = M = 0$. Another reason
for pursuing such a generalisation, despite the technical difficulties due to
the quadratic nature of $O$ in Eq. (6), is that the degrees of relaxation
$O$, $S$ and $M$ are completely independent of one another, whereas the
quantities $I$ and $S$ are mutually constrained via Eq. (16).

Acknowledgements I thank N. Gisin and C. Bran- 
ciard for stimulating discussions. 



Appendix A: Determinism vs outcome independence 

As noted in Sec. III, any set of statistical correlations admits a
deterministic model if and only if it admits an outcome independent model. A
brief proof is given here. This result further implies that derivations of
Bell inequalities based on outcome independence (or factorisability) are no
more general than derivations based on determinism. A proof of the relation
in Eq. (7), linking the measures of indeterminism $I$ and outcome dependence
$O$, is also given.

Proposition: For any set of statistical correlations $\{p(a, b|x, y)\}$,
there exists an underlying model $\mathcal{M}$ satisfying determinism if and
only if there exists an underlying model $\mathcal{M}'$ satisfying outcome
independence. Further, these models "commute" with the properties of no
signaling and measurement independence, i.e., $\mathcal{M}$ satisfies either
of these properties if and only if $\mathcal{M}'$ does.

Proof: Suppose first one has a model satisfying out- 
come independence, as per Eq. (jH). Choosing some 
fixed ordering of the possible results, $\{a_j\}$ and $\{b_k\}$, for each
measurement, define a corresponding deterministic model via: (i) the
underlying variable

$$ \tilde\lambda = (\lambda, \alpha, \beta), $$

where $\alpha$ and $\beta$ take values in the interval $[0, 1)$; (ii) the
corresponding probability density

$$ \tilde p(\tilde\lambda|x, y) = \tilde p(\lambda, \alpha, \beta|x, y) := p(\lambda|x, y), $$

for $\tilde\lambda$ (i.e., $\alpha$ and $\beta$ are uniformly and
independently distributed over the interval $[0, 1)$); and (iii)
deterministic joint probabilities $\tilde p(a_j, b_k|x, y, \tilde\lambda)$
equal to unity if and only if

$$ \alpha \in \Big[ \sum_{i<j} p(a_i|x, y, \lambda),\ \sum_{i \leq j} p(a_i|x, y, \lambda) \Big), \qquad
   \beta \in \Big[ \sum_{i<k} p(b_i|x, y, \lambda),\ \sum_{i \leq k} p(b_i|x, y, \lambda) \Big) $$

are satisfied (and equal to zero otherwise). It is trivial to check that, by
construction, for any pair of measurements $x$ and $y$ one then has

$$ \tilde p(a_j, b_k|x, y) = \int d\lambda\, p(\lambda|x, y)\, p(a_j|x, y, \lambda)\, p(b_k|x, y, \lambda). $$

Hence, there is a deterministic model as claimed. Further,
$\tilde p(a|x, y, \tilde\lambda)$ and $\tilde p(b|x, y, \tilde\lambda)$
satisfy the no-signaling conditions in Eq. (14) if and only if
$p(a|x, y, \lambda)$ and $p(b|x, y, \lambda)$ do, while
$\tilde p(\tilde\lambda|x, y)$ satisfies the measurement independence
condition in Eq. (24) if and only if $p(\lambda|x, y)$ does. Finally, the
converse is trivial, since any deterministic model is automatically an
outcome independent model. In particular, dropping explicit $x$, $y$, and
$\lambda$ dependence, suppose that $p(a), p(b) \in \{0, 1\}$. Then $p(a, b)$
is no greater than either of $p(a)$ and $p(b)$, implying $p(a, b) = 0$ if one
of the marginals vanishes. Otherwise $p(a) = p(b) = 1$, and so
$1 \geq p(a, b) = p(a) + p(b) - p(a \vee b) \geq p(a) + p(b) - 1 = 1$. Thus,
$p(a, b) = p(a)p(b)$ in all cases, i.e., outcome independence is satisfied.
$\Box$
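
The construction in the proof can be illustrated for a single fixed value of the setting pair and underlying variable. The Python sketch below is a toy check under stated assumptions: the outcome distributions `p_a` and `p_b` are hypothetical, and the uniform variables over $[0, 1)$ are replaced by midpoints of an equally spaced grid whose cell boundaries coincide with the cumulative sums, so that the averaging is exact. Each auxiliary value fixes the outcome deterministically, and averaging recovers the product distribution.

```python
from fractions import Fraction
from itertools import accumulate

def deterministic_outcome(probs, u):
    """Index j with u in [sum_{i<j} probs[i], sum_{i<=j} probs[i])."""
    for j, upper in enumerate(accumulate(probs)):
        if u < upper:
            return j
    raise ValueError("u must lie in [0, 1) and probs must sum to 1")

def reconstructed_joint(p_a, p_b, grid):
    """Average deterministic joint outcomes over a uniform (alpha, beta) grid."""
    joint = [[Fraction(0)] * len(p_b) for _ in p_a]
    weight = Fraction(1, grid * grid)
    for s in range(grid):
        for t in range(grid):
            alpha = Fraction(2 * s + 1, 2 * grid)  # midpoint of cell s
            beta = Fraction(2 * t + 1, 2 * grid)   # midpoint of cell t
            j = deterministic_outcome(p_a, alpha)
            k = deterministic_outcome(p_b, beta)
            joint[j][k] += weight
    return joint

# Hypothetical outcome-independent probabilities for one (x, y, lambda).
p_a = [Fraction(1, 4), Fraction(3, 4)]
p_b = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
joint = reconstructed_joint(p_a, p_b, grid=4)

for j, row in enumerate(joint):
    for k, value in enumerate(row):
        assert value == p_a[j] * p_b[k]
```

The fixed ordering of the outcomes enters only through the cumulative sums, which is where the (local) contextuality of the construction shows up.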

The above proposition is a simple generalisation of ex- 
isting results in the literature for single measurements 
d, [4l[ , and can be straightforwardly further generalised 
to continuous ranges of measurement outcomes and more 
than two observers. Note that the assumed ordering 
means that the model is (locally) contextual [41]. Fine has previously used
a rather different (nonlocally contextual) construction to obtain a form of
the proposition for the case of four measurement pairs [37], which can be
generalised to the case of a countable set of measurement pairs [42]. In
contrast, the above proposition applies to
arbitrary sets of measurement pairs, such as spin mea- 
surements in all possible directions (and does not require 
no-signaling or measurement independence assumptions 
as per Fine). 

It follows that all derivations of Bell inequalities make 
assumptions equivalent to, or stronger than, the exis- 
tence of an underlying model satisfying determinism, no 
signaling and measurement independence. This is some- 
times prima facie clear [IH^ 0| . While some derivations 
are based on measurement independence and the factoris- 
ability property p(a, b\x, y, A) = p{a\x, X)p{b\y, A) [l,[ll], 
this latter property is equivalent to the combination of 
outcome independence and no signaling in Eqs. ^ and 
([13]), which by the above proposition is equivalent to the 
existence of a deterministic nonsignaling model. Finally, 
some derivations are based on assuming the existence 
of underlying joint probability distributions for counterfactual measurement
settings [1, 41]; however, Fine has shown this is also equivalent to the
existence of an underlying model satisfying determinism, no signaling and
measurement independence [37].

To demonstrate the relation between the degrees of indeterminism and outcome
dependence in Eq. (7), for the case of two-valued measurements, denote the
possible outcomes by $\pm 1$ and order the joint measurement outcomes as
$(+, +)$, $(+, -)$, $(-, +)$, $(-, -)$. The corresponding joint probability
distribution for joint measurement setting $(x, y)$ can then be written in
the form

$$ p(a, b|x, y, \lambda) \equiv (c,\ m - c,\ n - c,\ 1 + c - m - n), \tag{A1} $$

where $m$ and $n$ denote the corresponding marginal probabilities for a $+1$
outcome. The positivity of probability implies that

$$ \max\{0, m + n - 1\} \leq c \leq \min\{m, n\}. \tag{A2} $$

The degree of outcome dependence for a particular model follows from Eq. (6)
as

$$ O = 4 \sup |c - mn|, \tag{A3} $$

where the supremum is over all possible triples $(c, m, n)$ generated by the
model.



Now, writing $\bar m = 1 - m$ and $\bar n = 1 - n$, Eq. (A2) is equivalent
to

$$ -\min\{mn, \bar m \bar n\} \leq c - mn \leq \min\{m \bar n, \bar m n\}, $$

and hence $|c - mn|$ can be no greater than the larger of the moduli of the
two bounds. But the modulus of the lower bound is $mn$ for $m + n \leq 1$ and
$\bar m \bar n$ for $\bar m + \bar n \leq 1$, with a similar result for the
upper bound, yielding

$$ |c - mn| \leq \max\{ uv \,:\, u + v \leq 1,\ u \in \{m, \bar m\},\ v \in \{n, \bar n\} \}. $$

For models having a degree of indeterminism $I$, one has
$m, n \in [0, I] \cup [1 - I, 1]$ from Eq. (5). Hence, the right-hand side has
a maximum of $I(1 - I)$, corresponding to $u = 1 - v = I$ (or $1 - I$). This
yields $O \leq 4I(1 - I)$ via Eq. (A3), as required.

The joint distributions achieving the maximum value of outcome dependence,
$O = 4I(1 - I)$, follow as $(I, 0, 0, 1 - I)$, $(1 - I, 0, 0, I)$,
$(0, I, 1 - I, 0)$, and $(0, 1 - I, I, 0)$. Note that these distributions are
either perfectly correlated, with $p(a = b) = 1$, or perfectly
anticorrelated, with $p(a = -b) = 1$.
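
The bound of $4I(1 - I)$ on the outcome dependence can be checked numerically with a simple grid search. The Python sketch below assumes the parametrisation of Eqs. (A1)-(A3) and a degree of indeterminism no greater than 1/2; the grid resolution and the function name are illustrative. Since the distance of $c$ from $mn$ is convex in $c$, only the two endpoints allowed by Eq. (A2) need to be tested for each pair of marginals.

```python
def max_outcome_dependence(I, steps=50):
    """Largest 4|c - mn| over a grid of marginals in [0, I] U [1 - I, 1]."""
    allowed = [I * t / steps for t in range(steps + 1)]
    allowed += [1 - v for v in allowed]
    best = 0.0
    for m in allowed:
        for n in allowed:
            c_lo, c_hi = max(0.0, m + n - 1), min(m, n)  # Eq. (A2)
            for c in (c_lo, c_hi):
                best = max(best, 4 * abs(c - m * n))
    return best

for I in (0.1, 0.3, 0.5):
    found = max_outcome_dependence(I)
    print(I, round(found, 6), round(4 * I * (1 - I), 6))
```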



Appendix B: Proof of relaxed Bell-CHSH inequality 

To obtain Eqs. (38) and (39) of the theorem in Sec. VII A, first write the
joint probability distribution for joint measurement setting $(x, y)$ as per
Eq. (A1). If $\langle XY \rangle_\lambda$ denotes the average product of the
measurement outcomes, for a fixed value of $\lambda$, then
$\langle XY \rangle_\lambda = 1 + 4c - 2(m + n)$. It follows from Eq. (A2),
noting $2\max(x, y) = x + y + |x - y|$, that

$$ 2|m + n - 1| - 1 \leq \langle XY \rangle_\lambda \leq 1 - 2|m - n|, \tag{B1} $$

where the upper and lower bounds are attainable via suitable choices of $c$.
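
A quick numerical sanity check of Eq. (B1) is sketched below in Python. It assumes the outcome ordering $(+,+)$, $(+,-)$, $(-,+)$, $(-,-)$ of Eq. (A1), computes the average product directly from the four joint probabilities, and confirms on random marginals that the extreme values of $c$ permitted by Eq. (A2) reproduce the two bounds. The function names and the random seed are arbitrary choices.

```python
import random

def correlation(c, m, n):
    """Average product of outcomes for the distribution (c, m-c, n-c, 1+c-m-n)."""
    probs = (c, m - c, n - c, 1 + c - m - n)
    signs = (1, -1, -1, 1)  # products a*b for (+,+), (+,-), (-,+), (-,-)
    return sum(s * p for s, p in zip(signs, probs))

def bounds(m, n):
    """Lower and upper bounds of Eq. (B1)."""
    return 2 * abs(m + n - 1) - 1, 1 - 2 * abs(m - n)

random.seed(0)
for _ in range(1000):
    m, n = random.random(), random.random()
    c_min, c_max = max(0.0, m + n - 1), min(m, n)  # Eq. (A2)
    lower, upper = bounds(m, n)
    assert abs(correlation(c_min, m, n) - lower) < 1e-12
    assert abs(correlation(c_max, m, n) - upper) < 1e-12
```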

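It is convenient to label the four measurement settings $(x, y)$, $(x, y')$,
$(x', y)$ and $(x', y')$ by 1, 2, 3 and 4, and to write
$p_1 \equiv p(a, b|x, y, \lambda)$, $p_2 \equiv p(a, b|x, y', \lambda)$, etc.,
and $P_1(\lambda) \equiv p(\lambda|x, y)$,
$P_2(\lambda) \equiv p(\lambda|x, y')$, etc. Defining

$$ T(\lambda) := P_1(\lambda)\langle XY \rangle_\lambda + P_2(\lambda)\langle XY' \rangle_\lambda
   + P_3(\lambda)\langle X'Y \rangle_\lambda - P_4(\lambda)\langle X'Y' \rangle_\lambda, $$

it immediately follows via Eq. (B1) that

$$ T(\lambda) \leq P_1(\lambda) + P_2(\lambda) + P_3(\lambda) + P_4(\lambda) - 2J(\lambda), $$

where

$$ J := P_1|m_1 - n_1| + P_2|m_2 - n_2| + P_3|m_3 - n_3| + P_4|m_4 + n_4 - 1| \tag{B2} $$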

and the upper bound is attained via the choices $c_j = \min\{m_j, n_j\}$ for
$j = 1, 2, 3$ and $c_4 = \max\{0, m_4 + n_4 - 1\}$. Note that $P_j$, $m_j$,
$n_j$ and $c_j$ are all functions of $\lambda$. Hence, the quantity on the
left hand side of Eq. (38) satisfies

$$ E := \langle XY \rangle + \langle XY' \rangle + \langle X'Y \rangle - \langle X'Y' \rangle
   = \int d\lambda\, T(\lambda) \leq 4 - 2 \int d\lambda\, J(\lambda). \tag{B3} $$

Thus, maximising this quantity corresponds to minimising the integral of the
nonnegative quantity $J(\lambda)$ in Eq. (B2). This minimum will now be
determined, subject to the constraints imposed by the statement of the
theorem, i.e.,

$$ m_j, n_j \in [0, I] \cup [1 - I, 1], \tag{B4} $$

$$ |m_1 - m_2|,\ |m_3 - m_4|,\ |n_1 - n_3|,\ |n_2 - n_4| \leq S, \tag{B5} $$

$$ \int d\lambda\, |P_j(\lambda) - P_k(\lambda)| \leq M. \tag{B6} $$

To proceed, suppose first that $S \geq 1 - 2I$. One may then take
$J(\lambda) = 0$ in Eq. (B2), consistently with the above constraints, via
the choices $m_j = n_j = m_4 = 1 - n_4 = I$ (or $1 - I$), for $j = 1, 2, 3$.
Hence, Eq. (B3) yields the tight bound $E \leq 4$ for this case, for any
$P_j(\lambda)$, as per the theorem. Equality is obtained when, for example,

$$ p_1 = p_2 = p_3 = (I, 0, 0, 1 - I), \qquad p_4 = (0, I, 1 - I, 0). \tag{B7} $$

Conversely, suppose that $S < 1 - 2I$. From the analysis of this case for
$M = 0$ in Ref. [15], at least one of the four absolute values in Eq. (B2)
for $J$ must be non-zero, for each $\lambda$, with a minimum value of
$1 - 2I$, while the other three absolute values can be chosen to vanish. For
example, choosing $m_j = n_j = I$ (or $1 - I$), for $j = 1, 2, 3, 4$, gives
$J(\lambda) = P_4(\lambda)(1 - 2I)$. More generally, choosing the
non-vanishing absolute value to correspond to the smallest multiplier $P_j$
in Eq. (B2), for each value of $\lambda$, one obtains the tight bound

$$ J(\lambda) \geq (1 - 2I) \min_j \{P_j(\lambda)\}, $$

leading via Eq. (B3) to the tight bound

$$ E \leq 4 - 2(1 - 2I) \int d\lambda\, \min_j \{P_j(\lambda)\}. $$

Eq. (39) immediately follows, providing that the tight bound

$$ \int d\lambda\, \min_j \{P_j(\lambda)\} \geq \max\{0, 1 - 3M/2\} \tag{B8} $$

can be established. This will now be done.
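
Combining Eq. (B3), the two cases for the signaling parameter, and the bound (B8) gives an explicit upper bound on $E$, sketched below as a small Python function. The closed form is an assumption pieced together from the steps of this appendix rather than a transcription of the theorem in Sec. VII A, and it presumes an indeterminism parameter no greater than 1/2; the function name is illustrative.

```python
from math import sqrt

def relaxed_chsh_bound(I, S, M):
    """Upper bound on E implied by Eqs. (B3) and (B8) (assumes 0 <= I <= 1/2)."""
    if S >= 1 - 2 * I:
        return 4.0  # J can be made to vanish, so only the trivial bound remains
    return 4 - 2 * (1 - 2 * I) * max(0.0, 1 - 1.5 * M)

# Standard Bell-CHSH bound: no indeterminism, signaling or measurement dependence.
print(relaxed_chsh_bound(0.0, 0.0, 0.0))

# Measurement dependence alone that would allow the quantum value 2*sqrt(2).
M_needed = (2 * sqrt(2) - 2) / 3
print(round(M_needed, 4), round(relaxed_chsh_bound(0.0, 0.0, M_needed), 4))
```

With no indeterminism and no signaling the bound reduces to $2 + 3M$, so it saturates at the algebraic maximum of 4 once $M$ reaches 2/3.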

First, since $2\min(x, y) = x + y - |x - y|$, one has in general that

$$ \min\{w, x, y, z\} = \min\{\min\{w, x\}, \min\{y, z\}\}
   = \tfrac{1}{2}\min\{w, x\} + \tfrac{1}{2}\min\{y, z\}
   - \tfrac{1}{2}\big|\min\{w, x\} - \min\{y, z\}\big|. $$

Suppose that $w \leq x$. Then if $y \leq z$ the 'absolute value' term above
is equal to $|w - y|$, while if $y > z$, the six possible orderings $wxzy$,
$wzxy$, $wzyx$, $zwxy$, $zwyx$, $zywx$ are easily checked to yield an
absolute value term no greater than $|w - y|$ in the first 3 cases and no
greater than $|x - z|$ in the second 3 cases. It follows that

$$ |\min\{w, x\} - \min\{y, z\}| \leq |w - y| + |x - z| $$

for $w \leq x$. But swapping $w$ with $x$ and $y$ with $z$ does not change
either side, implying that this inequality also holds for $x < w$. Thus, in
general,

$$ \min\{w, x, y, z\} \geq \tfrac{1}{2}\min\{w, x\} + \tfrac{1}{2}\min\{y, z\}
   - \tfrac{1}{2}|w - y| - \tfrac{1}{2}|x - z|
   = \tfrac{1}{4}(w + x + y + z) - \tfrac{1}{4}|w - x| - \tfrac{1}{4}|y - z|
   - \tfrac{1}{2}|w - y| - \tfrac{1}{2}|x - z|. $$

Substituting $w = P_1(\lambda)$, $x = P_2(\lambda)$, etc., integrating over
$\lambda$, and using the measurement dependence constraint in Eq. (B6), then
yields Eq. (B8) as desired (noting that the left hand side of this equation
is necessarily nonnegative).
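
The four-variable inequality used in the last step is easy to test numerically. The Python sketch below draws random values in the unit interval and checks that the minimum never falls below the stated combination of the mean and the absolute differences. The sample size, seed and tolerance are arbitrary choices, and a random test of this kind is only a plausibility check, not a proof.

```python
import random

def lower_bound(w, x, y, z):
    """Right-hand side of the general inequality for min{w, x, y, z}."""
    return ((w + x + y + z) / 4
            - abs(w - x) / 4 - abs(y - z) / 4
            - abs(w - y) / 2 - abs(x - z) / 2)

random.seed(1)
for _ in range(10000):
    w, x, y, z = (random.random() for _ in range(4))
    assert min(w, x, y, z) >= lower_bound(w, x, y, z) - 1e-12
```

Integrating the right-hand side over the underlying variable, with each pairwise distance bounded by $M$, gives $1 - M/4 - M/4 - M/2 - M/2 = 1 - 3M/2$.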

It still remains to show that the bound in Eq. (B8) is tight. First, for
$M \geq 2/3$ one needs to find suitable $P_j(\lambda)$ such that
$\min_j\{P_j(\lambda)\} = 0$ for all $\lambda$. This is achieved, for
example, via a model with 4 underlying variables,
$\lambda_1, \ldots, \lambda_4$, as per Table II of Ref. [13]. In particular,
choosing $P_j(\lambda_k)$ to be $p$ for $j = k$, 0 for $j + k = 5$, and
$(1 - p)/2$ otherwise, with $0 \leq p \leq 1/3$, one easily finds that
$M = 2 - 4p$, which ranges over the interval $[2/3, 2]$ as desired. Finally,
for $M < 2/3$, consider a model with 5 underlying variables,
$\lambda_1, \ldots, \lambda_5$, as per Table I of Ref. [13], i.e., with
$P_j(\lambda_k) = 1 - 3p$ for $k = 5$, 0 for $j + k = 5$, and $p$ otherwise,
again with $0 \leq p \leq 1/3$. One easily finds that $M = 2p$, which ranges
over the interval $[0, 2/3]$, with equality in Eq. (B8) as required.
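
The two tightness models can be written out explicitly and checked. The Python sketch below builds both tables of probabilities as described in the text (probability $p$ on the diagonal, zero on the anti-diagonal $j + k = 5$, and so on), then computes the largest pairwise variational distance and the sum over the underlying variable of the smallest entry. It assumes that $M$ in Eq. (B6) is the maximum over all pairs of joint settings; the function names and the sample value $p = 1/4$ are illustrative.

```python
from fractions import Fraction

def four_value_model(p):
    """P_j(lambda_k), j, k = 1..4: p if j == k, 0 if j + k == 5, else (1 - p)/2."""
    return [[p if j == k else Fraction(0) if j + k == 5 else (1 - p) / 2
             for k in range(1, 5)] for j in range(1, 5)]

def five_value_model(p):
    """P_j(lambda_k), k = 1..5: 1 - 3p if k == 5, 0 if j + k == 5, else p."""
    return [[1 - 3 * p if k == 5 else Fraction(0) if j + k == 5 else p
             for k in range(1, 6)] for j in range(1, 5)]

def measurement_dependence(P):
    """Largest variational distance between any two rows (joint settings)."""
    return max(sum(abs(a - b) for a, b in zip(r, s)) for r in P for s in P)

def integrated_minimum(P):
    """Sum over lambda_k of min_j P_j(lambda_k)."""
    return sum(min(column) for column in zip(*P))

p = Fraction(1, 4)
P4, P5 = four_value_model(p), five_value_model(p)
print(measurement_dependence(P4), integrated_minimum(P4))  # M = 2 - 4p, minimum 0
print(measurement_dependence(P5), integrated_minimum(P5))  # M = 2p, minimum 1 - 3p
```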



Appendix C: Relaxed $I_{mm22}$ inequalities

Here the relaxed Bell inequality of Eq. (52), related to $I_{3322}$, is
proved, and a generalisation to the case of $m$ measurement settings for each
observer is conjectured.

It is convenient to write the joint distribution
$p(a, b|x_j, y_k, \lambda)$ as per Eq. (A1), with $c$, $m$ and $n$ replaced
by $c_{jk}$, $m_{jk}$ and $n_{jk}$. Eqs. ([^Tjl and (B1) immediately imply
that

$$ A_{3322}(\lambda) \leq 8 - 2K, \tag{C1} $$

with equality for suitable choices of $c_{jk}$, where

$$ K := \sum_{j + k \leq 4} |m_{jk} - n_{jk}| + |m_{23} + n_{23} - 1| + |m_{32} + n_{32} - 1|. $$

Hence, the minimum possible value of $K$ must be determined, subject to the
constraints $m_{jk}, n_{jk} \in [0, I] \cup [1 - I, 1]$ and
$|m_{jk} - m_{jk'}|, |n_{jk} - n_{j'k}| \leq S$.

Defining $F_{jk} := |m_{jk} - n_{jk}|$ and $G_{jk} := |m_{jk} + n_{jk} - 1|$,
one has

$$ 2K = [F_{11} + F_{13} + F_{21} + G_{23}] + [F_{12} + F_{13} + F_{22} + G_{23}]
      + [F_{11} + F_{12} + F_{31} + G_{32}] + [F_{21} + F_{22} + F_{31} + G_{32}]. $$

Now, each of the square bracket terms corresponds to a particular case of the
quantity $J$ defined in the Appendix of Ref. [15], which was shown there to
have a minimum value of $1 - 2I$ for $S < 1 - 2I$ and 0 otherwise, under the
corresponding constraints. But for $S < 1 - 2I$ these minimum values are
simultaneously achieved by the choices $m_{jk} = n_{jk} = I$, while for
$S \geq 1 - 2I$ they are simultaneously achieved by choosing
$m_{jk} = n_{jk} = I$ when $j + k \leq 4$, and $m_{jk} = 1 - n_{jk} = I$ for
$j + k = 5$. Eq. (52) of the text immediately follows via Eq. (C1) and
integration over $\lambda$.
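
The bookkeeping behind this decomposition can be checked mechanically. The Python sketch below lists the four CHSH-type brackets, using hypothetical string labels such as "F11" for the absolute-value terms, and confirms that every term of $K$ appears exactly twice. It then evaluates the resulting bound from Eq. (C1), taking each bracket at the minimum value stated in the text. The bracket contents assume that the six difference terms are those with $j + k \leq 4$, which is an inference from the structure of the inequality.

```python
from collections import Counter

# Each bracket has the CHSH form: three difference terms F and one sum term G.
brackets = [
    ["F11", "F13", "F21", "G23"],
    ["F12", "F13", "F22", "G23"],
    ["F11", "F12", "F31", "G32"],
    ["F21", "F22", "F31", "G32"],
]
counts = Counter(term for bracket in brackets for term in bracket)
terms_of_K = {f"F{j}{k}" for j in range(1, 4) for k in range(1, 4)
              if j + k <= 4} | {"G23", "G32"}

assert set(counts) == terms_of_K
assert all(n == 2 for n in counts.values())  # so the brackets sum to 2K

def relaxed_3322_bound(I, S):
    """Bound 8 - 2*K_min, with each bracket at its stated minimum."""
    per_bracket = 1 - 2 * I if S < 1 - 2 * I else 0.0
    K_min = 4 * per_bracket / 2
    return 8 - 2 * K_min

print(relaxed_3322_bound(0.0, 0.0), relaxed_3322_bound(0.25, 0.0))
```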

A plausible generalisation of Eq. (52) corresponds to relaxing a variant of
the more general $I_{mm22}$ Bell inequality. This inequality holds for a
choice of $m$ measurement settings for each observer, with two-valued
measurement outcomes, and has the general form

$$ I_{mm22}(a, b) := \sum_{j,k=1}^{m} \alpha^{(m)}_{jk}\, p(a, b|x_j, y_k) - p(a|x_1)
   - \sum_{k} (m - k)\, p(b|y_k) \leq 0, $$

where $\alpha^{(m)}_{jk} = 1$ for $j + k \leq m + 1$,
$\alpha^{(m)}_{jk} = -1$ for $j + k = m + 2$, and $\alpha^{(m)}_{jk} = 0$
otherwise.

As for $I_{3322}$, the marginal probabilities in the above inequality are not
well defined for a non-zero degree of signaling, and hence it is convenient
to consider the variant obtained via multiplication by $1 + ab$ and summation
over $a, b = \pm 1$, i.e.,

$$ A_{mm22} := \sum_{j,k=1}^{m} \alpha^{(m)}_{jk} \langle X_j Y_k \rangle \leq \tfrac{1}{2} m(m - 1) + 1. $$

Note that this is equivalent to the standard Bell-CHSH inequality for
$m = 2$.

It is conjectured that the corresponding relaxed Bell inequality is

$$ A_{mm22} \leq B_{mm22}(I, S), \tag{C2} $$

where

$$ B_{mm22}(I, S) := \begin{cases} \tfrac{1}{2}(m - 1)(m + 8I) + 1, & S < 1 - 2I, \\
   \tfrac{1}{2}(m - 1)(m + 4) + 1, & \text{otherwise.} \end{cases} $$

This reduces to Eq. (38) for $m = 2$ (with $M = 0$), and to Eq. (52) for
$m = 3$. Note that the upper bound is obtained for $S < 1 - 2I$ via the
choice $m_{jk} = n_{jk} = I$, and for $S \geq 1 - 2I$ via the choices
$m_{jk} = n_{jk} = I$ when $j + k \leq m + 1$ and $m_{jk} = 1 - n_{jk} = I$
when $j + k = m + 2$.
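
The conjectured bound can be tabulated to confirm the special cases mentioned above. The Python sketch below encodes the coefficients and the two branches of the bound as reconstructed from the text (including the factors of one half, which are an inference), and checks that the bound reduces to the standard value when there is no indeterminism and to the algebraic maximum once the gap condition on signaling is met. All function names are illustrative.

```python
def alpha(j, k, m):
    """Coefficient of <X_j Y_k>: +1 for j+k <= m+1, -1 for j+k == m+2, else 0."""
    if j + k <= m + 1:
        return 1
    if j + k == m + 2:
        return -1
    return 0

def algebraic_maximum(m):
    """Largest conceivable value of A_mm22: every correlator at its extreme."""
    return sum(abs(alpha(j, k, m))
               for j in range(1, m + 1) for k in range(1, m + 1))

def standard_bound(m):
    return m * (m - 1) / 2 + 1

def b_mm22(m, I, S):
    """Conjectured relaxed bound (assumes 0 <= I <= 1/2)."""
    if S < 1 - 2 * I:
        return (m - 1) * (m + 8 * I) / 2 + 1
    return (m - 1) * (m + 4) / 2 + 1

for m in range(2, 7):
    assert b_mm22(m, 0.0, 0.0) == standard_bound(m)
    assert b_mm22(m, 0.0, 1.0) == algebraic_maximum(m)

print(b_mm22(2, 0.0, 0.0), b_mm22(3, 0.0, 0.0), b_mm22(3, 0.25, 0.0))
```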
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